EXECUTIVE SUMMARY

Evaluations  of  reliability,  maintainability,  and  availability 
(RMA)  of  large-scale  complex  systems  have  received  a  great  deal  of 
attention in defense and commercial fields. In these studies, an
extremely difficult yet critical issue is the effectiveness of the
model.

Experience in RMA analysis for many practical large-scale
systems has shown that more than 50% of BIT-related maintenance
actions are due to false alarms. This clearly implies excessive
operation and support (O&S) costs. Further, to improve
system availability, one often employs redundant components (or
modules). Redundancy not only increases hardware costs but also
complicates the analysis of system RMA, because it
increases modeling complexity, especially for large-scale systems.

Analytical models for such systems that provide an accurate
picture yet are not too complicated are very difficult to find.
Simulations, though they can be made very accurate, are often
costly. Analytical models, on the other hand, are very efficient
for sensitivity analysis and numerous tradeoff studies, provided
that they are accurate.

Two important ingredients must be taken into account in setting
up models for RMA analysis: the conditions and the activities
(events) of the system. For large-scale systems, the numbers
of conditions and activities often become intractably large. It is
this problem that has prevented most currently available schemes
from providing accurate and effective RMA analysis.
In this report, we propose to use generalized stochastic Petri
nets (GSPN) in RMA studies. The novelty of this modeling approach
rests on the following distinctive features.

(i) The GSPN offers a precise description of system activities
and conditions while involving less complexity than
other modeling techniques. Specifically, it is an inherently
effective bookkeeping scheme for conditions and activities.

(ii) It provides clear insight into the key parameters
that affect RMA analysis. Causes and results of events can be
easily tracked by executing the GSPN.

(iii) It takes advantage of the concurrency
and timing of events, and thus describes the sequence of
events accurately.

The  report  is  divided  into  two  parts.  In  the  first  part, 
definitions  and  classifications  of  concurrent  tasks  are  given. 
Existing  analytical  models  which  are  based  on  queueing  networks 
(QN)  are  reviewed.  Approximate  hierarchical  models  based  on  GSPN 
and  QN  are  both  presented.  In  the  second  part,  analysis  of  GSPN 
is  considered.  Techniques  for  reducing  analytical  complexity  such 
as reduction and aggregation of GSPN are introduced. Applications
of  such  techniques  to  the  approximate  hierarchical  decomposition 
of  stochastic  Petri  nets  are  discussed.  In  addition,  approximate 
lumping  of  synchronous  parallel  operations  is  considered.  For 
simplicity  of  discussions,  many  of  these  results  are  illustrated 
via  performance  evaluation  of  computer  systems.  Finally,  examples 
of  RMA  analysis  and  fault  detection  and  isolation  are  given  using 
the  developed  techniques. 
INTRODUCTION

1.1  Motivation 

Advances  in  solid-state  technology  have  provided  us  with  high 
speed  computer  systems  of  ever  increasing  computational  power.  In 
addition  to  the  speed  of  the  components,  their  organization  may 
be  a  limiting  factor.  Design  efforts  have  therefore  been  geared 
toward  improving  performance  on  the  system  level.  Parallelism  in 
the architecture has been the most successful approach. It
includes multiple functional units, pipelining, array structures,
and multiprocessor architectures. Distributed computing and
network configurations represent parallelism on an even higher
level. Design efforts also focus on operating system functions
and strategies for managing system resources, and further
improvements can yet be obtained by designing programming
language features that match the underlying architecture. Common
to all these design efforts is the desire to evaluate performance
impacts prior to implementation. Even though the basic components
of such systems are inexpensive, the design costs are so high
that an incorrect design which is undetected until late in the
development process can have a serious negative impact on a
company. Therefore, cost-effective tools for performance
prediction of such systems, at the early stage of design, are of
vital importance.

Simulation models, though they can be made very accurate, are
not cost-effective. Therefore, they are not adequate at the
early stages of design, when the design space is very large.
Simulations are most valuable, however, when detailed evaluations
are required at the final stage of design.

Analytical models are cost-effective because they are based
on efficient solutions to mathematical equations. However, in
order for these equations to have a tractable solution, certain
simplifying  assumptions  must  be  made  regarding  the  structure  and 
behaviour  of  the  model.  As  a  result,  analytical  models  cannot 
capture  all  the  details  that  can  be  built  into  simulation  models. 
Nevertheless,  an  analytical  model  can  provide  insight  into  the 
key  factors  affecting  performance  of  a  proposed  system,  and 
determine  the  sensitivity  of  performance  to  parameter  changes. 
Such  a  model  can  provide  guidance  into  the  overall  design  of  the 
system  and  also  be  useful  in  the  development  of  more  detailed 
simulation  models  as  the  design  matures. 

1.2  Definition  of  Parallel  Processing: 

Parallel  processing,  in  contrast  to  sequential  processing,  is 
a  cost-effective  means  to  improve  performance  through  concurrent 
activities in the computer. Parallel processing can formally be
defined as follows [KA1 34]:

Definition: Parallel processing is an efficient form of
information processing which emphasizes the exploitation of
concurrent events in the computing process of a job. Concurrency
implies parallelism, simultaneity, and pipelining. Parallel
events may occur in multiple resources during the same time
interval; simultaneous events may occur at the same instant; and
pipelined events may occur in overlapped time spans.

Concurrent  events  in  the  processing  of  a  job  are  attainable 
at  various  levels.  These  levels  are  summarized  as  follows  : 
1- Task or procedure level.

2-  Interinstruction  level. 

3-  Intrainstruction  level. 

The first level is conducted among procedures or tasks
(program segments). This involves the decomposition of a program
into multiple tasks which may be processed concurrently. The
second level exploits concurrency among multiple
instructions. Data dependency analysis is often performed to
reveal parallelism among instructions. Vectorization may be
desired among scalar operations within DO loops. Finally, at the
third level, concurrent operations within each instruction can be
exploited. The highest level is often conducted algorithmically
and will be discussed further in chapter 2. The lowest level is
implemented directly by hardware.

Parallel  computers  are  those  systems  that  emphasize  parallel 
processing.  Such  systems  are  categorized  as  follows  : 

1-  Pipeline  computers  :  such  systems  perform  overlapped 
computations  to  exploit  temporal  parallelism.  They  are  more 
attractive  for  vector  processing,  where  component  operations  may 
be  repeated  many  times. 

2- Array computers: an array processor is a synchronous
parallel computer with multiple arithmetic logic units, called
processing elements (PEs). The PEs are synchronized to perform the
same function at the same time.

3- Multiprocessor systems: these consist of two or more processors
of comparable capabilities that operate asynchronously. All
processors share access to common sets of memory modules, I/O
channels, and peripheral devices. Each processor has its own
local memory and private devices. The entire system is controlled
by a single operating system providing interactions between
processors and their programs at various levels.

Clearly, pipeline and array computers exploit parallelism at
the inter- and intra-instruction levels, whereas parallel processing
at the task level is appropriate in a multiprocessor system. Our
primary  goal  in  this  work  is  to  develop  analytical  models  for 
parallel  processing  in  a  multiprocessor  environment. 

1.3 Analytical Performance Modeling:

Computer systems can be generally characterized as consisting
of a set of hardware resources (e.g., processors, channels, disks)
and a set of tasks, or jobs, competing for and accessing
those resources. Because there are multiple jobs competing for a
limited number of resources, queues for the resources are
inevitable, and with these queues come delays. It is, then,
natural to model the system by a network of interconnected
queues. The purpose of the model is to predict the performance of
the system by estimating characteristics of the resource
utilization, the queue lengths, and the queueing delays.
Therefore, analytic models of computer systems have been solely
based on queueing network (QN) models.

Research  in  performance  modelling  methodology  has  essentially 
been  research  in  queueing  theory.  Key  advances  in  computer 
performance  modeling  have  also  been  seen  as  fundamental 
breakthroughs  in  queueing  theory.  Queueing  Theory  has  attained 
new  relevance  because  of  the  computer  performance  modelling 
application. Furthermore, to a great extent, the direction of
queueing theory has been influenced and driven by this
application [HED 34].

Figure 1.1 shows the famous QN model of multiprogramming
systems, the so-called central server model [BUZ 71]. It was
introduced to model contention among programs for processors and
I/O devices. The model is a QN consisting of J service centers
and a population of N active jobs (the multiprogramming level).
Service center (SC) 1 represents the processor, and service
center j, j = 2, ..., J, represents an I/O device. Each job is assumed
to reside in main memory and goes through a number of CPU-I/O
cycles: it executes on the CPU, performs I/O on one of the I/O
devices, and returns to the CPU, repeating this process until it
is terminated. The termination of a program and initiation of a
new program is represented by a job re-entering service center 1
having completed service from that service center. In order to
completely define the model, the following must be specified:

1)  The  queueing  disciplines  at  each  one  of  the  centers. 

2)  The  service  requirement  of  jobs  at  the  centers. 

3)  The  routing  probabilities  of  jobs  between  centers. 

When the above are appropriately defined, the evolution of
the network can be represented by a continuous-time Markov chain
(MC), the state of which is defined by the number of jobs at each
SC. However, as N and J increase, the state space of this MC
becomes unmanageably large.
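
To give a feel for this growth: if the state is just the number of jobs at each of the J service centers, then a state is a way of distributing N indistinguishable jobs over J centers, and there are C(N+J-1, J-1) of them. The following Python sketch (the function names are illustrative and not part of the report) enumerates the states for small cases and compares the count with the closed form.

```python
from itertools import product
from math import comb

def central_server_states(n_jobs, n_centers):
    """Enumerate all states (n_1, ..., n_J) with n_1 + ... + n_J = N.

    Brute force, so only practical for small N and J.
    """
    return [s for s in product(range(n_jobs + 1), repeat=n_centers)
            if sum(s) == n_jobs]

def state_count(n_jobs, n_centers):
    """Closed-form number of states: C(N + J - 1, J - 1)."""
    return comb(n_jobs + n_centers - 1, n_centers - 1)

# Small cases: the enumeration and the closed form agree.
for n, j in [(2, 3), (5, 4), (10, 5)]:
    assert len(central_server_states(n, j)) == state_count(n, j)
    print(f"N={n}, J={j}: {state_count(n, j)} states")

# A moderately sized system is already far beyond direct enumeration.
print(f"N=50, J=10: {state_count(50, 10)} states")
```

The closed form is evaluated directly for the larger case because the brute-force enumeration itself becomes impractical, which is the point made in the text.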

For a restricted class of networks called product form
networks [KLI 75, CH 81], several computationally efficient
analysis algorithms have been developed [CH 81, RSI S3]. The
existence of such algorithms for a broad class of models makes
analytic queueing models an attractive tool for applied
performance modelling studies of computer systems. The above
central server model has been used in several performance
prediction studies (e.g., the VM/370 performance predictor [BARD
77/78]).

Product form QNs, however, are not suitable for modelling
parallel processing [CH 81, HID 84]. In chapter 2, models of
parallel computations will be discussed. A classification of
parallel programs based on such models will also be discussed. In
chapter 3, current analytical models of parallel processing
systems will be briefly described. In chapter 4, models of
parallel processing systems using generalized stochastic
Petri nets (GSPN) will be presented. In chapters 5, 6, and 7, the
analysis techniques for such networks will be developed. Finally,
in chapter 8, analytical models for system reliability,
maintainability/availability, and fault diagnosis, using GSPNs,
are considered.
CHAPTER 2

MODELS  OF  PARALLEL  PROGRAMS 
2.1 Introduction

A  parallel  program  consists  of  several  cooperating  concurrent 
tasks  that  can  be  executed  in  parallel.  The  terms  task  and 
process  are  intended  to  mean  a  self  contained  portion  of  a 
computation  that  once  initiated  can  be  carried  out  to  its 
completion.  The  completion  of  a  task  is  significant  in  that  its 
occurrence  can  initiate  the  execution  of  another  set  of  tasks. 

The problem of defining parallel programs has received much
attention in the literature. Two approaches have been followed. One is
to have explicit concurrency, by which the programmer specifies
the concurrency using certain language constructs. Conway [CON63]
proposed FORK and JOIN statements. FORK spawns a new concurrent
process, and JOIN waits for a previously created process to
terminate. Dijkstra [DIJ68] proposed a block-structured language
which defines concurrent tasks by using the constructs parbegin
and parend. For example, in the following program segment, the
computations for matrices A and B are to be carried out in
parallel.

begin
    initialize;
    parbegin
        compute matrix A;
        compute matrix B;
    parend
    C = A*B;
end
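
For readers more familiar with present-day languages, the same fork/join structure can be sketched in Python. This is only an illustration of the parbegin/parend pattern: the matrix contents and all function names are made up for the example, and in CPython the threads show the control structure rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_matrix_a():
    # Placeholder for the real computation of A (values are arbitrary).
    return [[1, 2], [3, 4]]

def compute_matrix_b():
    # Placeholder for the real computation of B (values are arbitrary).
    return [[0, 1], [1, 0]]

def matmul(a, b):
    """Plain triple-loop matrix product, sufficient for a small example."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def run_parallel_segment():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # parbegin: both tasks are started (FORK) and may run concurrently.
        future_a = pool.submit(compute_matrix_a)
        future_b = pool.submit(compute_matrix_b)
        # parend: wait for both tasks to finish (JOIN).
        a = future_a.result()
        b = future_b.result()
    # The sequential statement after parend runs only once both are done.
    return matmul(a, b)

print(run_parallel_segment())
```

The two `result()` calls play the role of parend: the product is not formed until both concurrent tasks have completed.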


Several general purpose high level languages have
incorporated these concepts in their definitions (PL/I, ALGOL-60,
Concurrent Pascal, Ada, ...).

The second approach is to have implicit concurrency. In this
case the compiler determines what can be executed in parallel
[BAER73].

In section 2.2, graph models of parallel computations will be
described [KAR66, ADAM70, CER72, BEA77, PET83, MOL81]. These
models were developed to facilitate the design of parallel programs
and deal with the issues of correctness and efficiency. In section
2.3, classifications of parallel programs and algorithms will be
discussed.

2.2  Models  of  Parallel  Computations 
2.2.1  Computation  Graphs 

Karp and Miller [KAR66] and Adams [ADAM70] have developed
models  for  parallel  computations,  in  which  the  sequencing  control 
is  governed  by  the  flow  of  data.  A  directed  graph  was  used  to 
represent  the  computation.  The  nodes  of  the  graph  represent 
computation  steps,  which  can  range  from  a  single  operation  to  a 
complex  computational  task.  An  edge  in  the  graph  can  be  thought 
of  as  a  queue  of  data  produced  by  one  node  and  waiting  to  be 
consumed  by  another.  A  computation  step  may  be  initiated  whenever 
each  edge  directed  into  that  node  of  the  graph  contains  the 
amount  of  data  required  for  that  node  to  execute  properly.  The 
number  of  computation  steps  which  may  be  executed  at  any  given 
time  is  dynamically  determined  by  the  flow  of  data.  Thus 
unnecessary  sequencing  constraints  may  be  eliminated. 

The properties of the model with which Karp was particularly
concerned are: 1- a proof that the model is determinate, i.e.,
for a given input, the program will yield a unique output
independent of the relative processor speeds; 2- a test to
determine whether a given computation will indeed terminate; 3- a
procedure for finding the number of performances of each
computation step; and 4- the amount of temporary storage required
for the data queues associated with the branches (edges) of the
graph, together with the conditions for the queue length to
remain bounded. The weakness in this model, however, is that
data-dependent conditional transfers cannot be taken into account,
since the logic at the nodes corresponds to AND-input-AND-output
logic, i.e., the computation is started when enough data exist on
all input edges, and the output data is placed on all output
edges.

The  model  described  by  Adams  was  an  attempt  to  provide  a 
framework  within  which  various  classes  of  computations  can  be 
represented.  The  model  has  been  developed  so  that  computations 
represented  within  it  will  be  determinate.  The  model  also  deals 
with  data  structures  ,  and  a  hierarchical  description  of  the 
program.  Data  structures  were  treated  with  generality  and  include 
the  hardware  defined  structures  such  as  bits  and  words,  and 
structures  usually  defined  in  a  programming  language  such  as 
arrays,  strings,  and  lists.  The  hierarchical  program  description 
was achieved by being able to treat each node in the graph as
representing an operation, perhaps a very complex one, and also being able
to represent as a graph the suboperations or instructions of
which it is constructed. Moreover, data-dependent conditional
transfers can be accommodated. This is achieved by dividing the
nodes of the graph into two types: computational (r-nodes), which
map data on the incoming edges to data on the outgoing edges,
and computational and logical (s-nodes), which also map edge
status as locked or unlocked. An edge with a locked status is
treated by successor nodes as empty (containing no data).

2.2.2  Control  graphs 

Control graphs [CER72, BEA77] are bilogic directed graphs.
The arcs contain a non-negative number of tokens. Logic expressions
are assigned to the set of input arcs and to the set of output
arcs for each node in the graph. The expressions are made of
"and's" (*) and "or's" (+). Computation is simulated by the
movement of tokens from arcs through nodes to arcs. Formally, a
control graph is defined as follows:

$B = (G, L, Q)$, where $G = (W, U)$ is a directed graph in which
$W = \{w_1, \ldots, w_n\}$ is the set of nodes and $U$ is the set of ordered
pairs, or arcs, $u_k = (w_i, w_j)$. There is a unique entry arc with $w_i = \emptyset$.
$L = (L^-, L^+)$ gives the logic conditions ($L^-$ is the input logic and $L^+$ is
the output logic). Thus with each node $w_i$ is associated one of the
ordered pairs $(*,+)$, $(+,*)$, $(+,+)$, $(*,*)$. If $L^- = *$, $w_i$ is said to be
of AND-input logic (respectively OR if $L^- = +$), and similarly, if
$L^+ = *$, $w_i$ is said to be of AND-output logic (respectively OR).
Finally, $Q = (Q^-, Q^+)$ are the (input, output) token value
specifications, which map $W \times U$ into the set of positive integers.

The initiation of a computation modeled by $w_i$ can proceed
when $L^- = *$ (respectively $L^- = +$) only if for each (at least one)
incident arc $a$ there are at least $Q(w_i, a)$ tokens on it. Figure 2.1
shows an example of a control graph which could be a model for
the following segment of a program:
Repeat
    node 1;
    parbegin
        begin action at node 2 .. end
        begin action at node 3 .. end
    parend
Until condition at node 4;
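
The initiation rule stated above (AND-input logic needs enough tokens on every incident input arc, OR-input logic on at least one) can be sketched as a small Python predicate. The arc labels and token counts below are hypothetical and chosen only to illustrate the rule for a node that joins two parallel branches.

```python
def can_initiate(input_logic, tokens, required):
    """Check whether a control-graph node may start its computation.

    input_logic: '*' for AND-input logic, '+' for OR-input logic.
    tokens:      dict mapping each incident input arc to its token count.
    required:    dict mapping each incident input arc a to Q(w_i, a).
    """
    ready = [tokens[arc] >= required[arc] for arc in required]
    if input_logic == '*':
        return all(ready)      # every input arc must hold enough tokens
    if input_logic == '+':
        return any(ready)      # one input arc with enough tokens suffices
    raise ValueError("input_logic must be '*' or '+'")

# Hypothetical join node fed by two branches, one token required on each arc.
required = {"branch_2": 1, "branch_3": 1}

only_one_done = {"branch_2": 1, "branch_3": 0}
both_done = {"branch_2": 1, "branch_3": 1}

print(can_initiate('*', only_one_done, required))  # AND-input: must wait
print(can_initiate('+', only_one_done, required))  # OR-input: may start
print(can_initiate('*', both_done, required))      # AND-input: may start
```

With AND-input logic the node acts as a synchronization point for the two branches, which is the behaviour parend requires.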

The main impetus behind control graph studies is to show
that the graphs terminate properly, i.e., that they
represent correct and terminating programs from the flow of
control viewpoint.

2.2.3  Standard  Petri  Nets 

In  this  section  a  simple,  yet  very  powerful  graph  model  of 
behaviour  will  be  presented. 

2.2.3.1 Petri Net Structure

Definition 1: A Petri net is a bipartite directed graph
$PN = (X, A)$, where

1- $X = P \cup T$ is a finite set of nodes, with $P = \{p_1, \ldots, p_n\}$ being a
set of places and $T = \{t_1, \ldots, t_m\}$ being a set of transitions, such
that $P \cap T = \emptyset$.

2- $A = I \cup O$ is a finite set of directed arcs, where
$I: P \times T \to B$ are the input arcs, and
$O: T \times P \to B$ are the output arcs, with $B = \{0, 1\}$.

In the sequel, a Petri net will be referred to by the four-tuple
$PN = (P, T, I, O)$, and the functions $I$ and $O$ will be called
the input and output functions. A place $p_i \in P$ such that
$I(p_i, t_j) > 0$ ($O(t_j, p_i) > 0$) is called an input place (output
place) of transition $t_j \in T$.

Figure  2.2  shows  a  Petri  Net  (PN),  with  places  drawn  as 
circles  and  transitions  drawn  as  bars. 
$P = \{p_1, p_2, p_3, p_4, p_5, p_6\}$

$T = \{t_1, t_2, t_3, t_4, t_5, t_6\}$

The elements of $I$ and $O$ that are $> 0$ are
$I(p_1,t_1)$, $I(p_2,t_2)$, $I(p_3,t_3)$, $I(p_4,t_4)$, $I(p_5,t_4)$, $I(p_6,t_5)$,
$I(p_6,t_6)$, $O(t_1,p_2)$, $O(t_1,p_3)$, $O(t_2,p_4)$, $O(t_3,p_5)$, $O(t_4,p_6)$,
and $O(t_5,p_1)$.

2.2.3.2 Petri Net Marking

A marking is an assignment of tokens to places of a PN.
Tokens  can  be  thought  to  reside  in  the  places,  and  the  number  and 
position  of  tokens  may  change  during  the  execution  of  a  PN.  The 
tokens  are  used  to  define  the  execution  of  a  PN. 

Definition 2: A marking $M$ of a Petri net $PN = (P, T, I, O)$ is a
function from the set of places to the nonnegative integers $N$,
i.e., $M: P \to N$.

The marking $M$ can also be defined as an $n$-vector
$M = (m_1, \ldots, m_n)$, where $m_i$ is the number of tokens in $p_i$,
$i = 1, \ldots, n$. The definitions of a marking as a function and as a
vector are obviously related by $M(p_i) = m_i$.

2.2.3.3 Execution Rules for Petri Nets

The  execution  of  a  PN  is  controlled  by  the  distribution  of 
tokens  in  the  PN  places.  A  place  holding  one  or  more  tokens  is 
said  to  be  full.  A  PN  executes  by  firing  transitions.  A 
transition  is  firable  (enabled)  if  all  of  its  input  places  are 
full. 

Definition 3: A transition $t_j \in T$ in a marked $PN = (P, T, I, O)$ with
marking $M$ is enabled if and only if, for all $p_i \in P$,

$I(p_i, t_j) \le M(p_i)$

The  firing  of  a  transition  generates  a  new  marking  by 
removing  tokens  from  the  input  places  and  adding  tokens  to  the 
output  places. 

Definition 4: A transition $t_j$ in a marked PN with marking $M$ may
fire whenever it is enabled. Firing an enabled transition $t_j$
results in a new marking $M'$ defined by

$M'(p_i) = M(p_i) - I(p_i, t_j) + O(t_j, p_i), \quad \forall p_i \in P$    (2.2.1)

$M'$ is said to be immediately reachable from $M$.
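
Definitions 3 and 4 translate almost directly into code. The Python sketch below is illustrative only: the five-place fork/join net, the dictionary representation, and the function names are assumptions made for this example and are not the net of Figure 2.2.

```python
# Hypothetical net: t1 forks p1 into p2 and p3, t2 and t3 process the two
# branches, and t4 joins p4 and p5 back into p1.
PLACES = ["p1", "p2", "p3", "p4", "p5"]

# Nonzero entries of the input function I(p, t) and output function O(t, p).
INPUT = {("p1", "t1"): 1, ("p2", "t2"): 1, ("p3", "t3"): 1,
         ("p4", "t4"): 1, ("p5", "t4"): 1}
OUTPUT = {("t1", "p2"): 1, ("t1", "p3"): 1, ("t2", "p4"): 1,
          ("t3", "p5"): 1, ("t4", "p1"): 1}

def enabled(marking, t):
    """Definition 3: t is enabled iff I(p, t) <= M(p) for every place p."""
    return all(INPUT.get((p, t), 0) <= marking[p] for p in PLACES)

def fire(marking, t):
    """Definition 4: M'(p) = M(p) - I(p, t) + O(t, p) for every place p."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t} is not enabled")
    return {p: marking[p] - INPUT.get((p, t), 0) + OUTPUT.get((t, p), 0)
            for p in PLACES}

m1 = {"p1": 1, "p2": 0, "p3": 0, "p4": 0, "p5": 0}
print(enabled(m1, "t1"), enabled(m1, "t2"))

m2 = fire(m1, "t1")          # fork: one token each in p2 and p3
m3 = fire(fire(m2, "t2"), "t3")
print(enabled(m3, "t4"))     # the join is enabled only after both branches
print(fire(m3, "t4") == m1)  # firing the join restores the initial marking
```

Note that t4 is not enabled in the marking reached after firing t2 alone, since p5 is still empty; this is how the net expresses synchronization.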

A more general definition of a PN can be obtained by
assigning weights to the input and output directed arcs between
places and transitions.

Definition 5: A Petri net is the four-tuple $PN = (P, T, I, O)$,
where the input and output functions $I$ and $O$ are now defined as

$I: P \times T \to N$, and $O: T \times P \to N$

In this case, following the above definitions for enabling
and firing of transitions, a transition is enabled if and only if
each of its input places holds at least as many
tokens as the weight of the arc linking it to the transition.
Moreover, the firing of a transition generates a new marking by
removing tokens from the input places and adding tokens to the
output places according to the weights of the input and output
arcs. In a PN graph a weight label is added to each directed arc
(the weight may not be indicated if it is 1).

The state of a PN is defined by its marking. The firing of a
transition represents a change in the state of a PN. The state
space of a PN with $n$ places is the set of all markings, that is,
$N^n$. The change in state caused by firing a transition is defined
by a change function called the next state function $f$.

Definition 6: The next state function $f: N^n \times T \to N^n$ for a
$PN = (P, T, I, O)$ with marking $M$ and for $t_j \in T$ is defined if and
only if $M(p_i) \ge I(p_i, t_j)$ for all $p_i \in P$ (i.e., $t_j$ is enabled in
$M$). If $f(M, t_j)$ is defined, then $f(M, t_j) = M'$, where $M'$ is
defined as in (2.2.1).

Given a $PN = (P, T, I, O)$ and an initial marking $M_1$, we can
execute the PN by successive transition firings. Firing an
enabled transition $t_j$ in the initial marking produces a new
marking $M_2 = f(M_1, t_j)$. In this new marking we can fire any
enabled transition, say $t_k$, resulting in a new marking
$M_3 = f(M_2, t_k)$. This can continue as long as there is at least one
enabled transition in each marking. If a marking is reached where
no transition is enabled, then no transition can fire, the
function $f$ is undefined for all transitions, and the execution
must halt.

Two sequences result from the execution of a PN: the sequence
of markings $(M_1, M_2, M_3, \ldots)$, and the sequence of transitions
which were fired $(t_{j_1}, t_{j_2}, \ldots)$. These two sequences are related
by the relationship $f(M_k, t_{j_k}) = M_{k+1}$, $k = 1, 2, 3, \ldots$

The set of all reachable markings from the initial marking is
called the reachability set $S$.

Definition 7: The reachability set $S$ for a marked $PN = (P, T, I, O)$
with initial marking $M_1$ is the smallest set of markings defined
by: 1) $M_1 \in S$; 2) if $M' \in S$ and $M'' = f(M', t_j)$ for some $t_j \in T$,
then $M'' \in S$.
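
Definition 7 is constructive: starting from the initial marking, one repeatedly applies the next state function until no new marking appears. A minimal Python sketch of this closure follows, assuming markings are tuples and each transition is given as a pair of input and output weight vectors. The example net, the function name, and the `limit` guard (needed because the set is infinite for unbounded nets) are all assumptions of this sketch.

```python
from collections import deque

def reachability_set(initial, transitions, limit=10_000):
    """Compute the reachability set S by breadth-first search.

    initial:     initial marking M_1 as a tuple of token counts.
    transitions: dict name -> (input_vector, output_vector), where the
                 vectors hold I(p_i, t) and O(t, p_i) for each place p_i.
    limit:       safety cap on |S|, since S is infinite for unbounded nets.
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for inp, out in transitions.values():
            # Enabling rule: I(p_i, t) <= M(p_i) for every place.
            if all(i <= m for i, m in zip(inp, marking)):
                # Firing rule (2.2.1).
                nxt = tuple(m - i + o for m, i, o in zip(marking, inp, out))
                if nxt not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("state limit exceeded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

# Hypothetical fork/join net over places (p1, p2, p3, p4, p5).
NET = {
    "t1": ((1, 0, 0, 0, 0), (0, 1, 1, 0, 0)),  # fork
    "t2": ((0, 1, 0, 0, 0), (0, 0, 0, 1, 0)),  # branch via p2
    "t3": ((0, 0, 1, 0, 0), (0, 0, 0, 0, 1)),  # branch via p3
    "t4": ((0, 0, 0, 1, 1), (1, 0, 0, 0, 0)),  # join
}

S = reachability_set((1, 0, 0, 0, 0), NET)
for marking in sorted(S):
    print(marking)
```

For this small net the search visits the initial marking, the marking after the fork, the two markings in which only one branch has finished, and the marking in which both have finished.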

A transition $t_j$ is live if, for all markings $M' \in S$, there
exists an execution sequence which reaches a marking $M''$ where $t_j$
is firable. A PN is live if all its transitions are live. A PN is
said to be k-safe (k-bounded) if a place cannot hold more than $k$
tokens at any time, i.e., if $M(p_i) \le k$ for all $p_i \in P$ and all $M \in S$.
A PN is said to be safe if $k = 1$.
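
Once a finite reachability set is known, the k-safe property reduces to a scan over its markings. The short Python sketch below takes the reachability set as given; the two sets shown are hand-written illustrations and the function names are assumptions of this example.

```python
def bound(reachable_markings):
    """Smallest k such that M(p_i) <= k for every place and every marking."""
    return max(max(marking) for marking in reachable_markings)

def is_k_safe(reachable_markings, k):
    """True if no place ever holds more than k tokens."""
    return bound(reachable_markings) <= k

# Reachability set of a hypothetical five-place fork/join net.
fork_join = {(1, 0, 0, 0, 0), (0, 1, 1, 0, 0), (0, 0, 1, 1, 0),
             (0, 1, 0, 0, 1), (0, 0, 0, 1, 1)}

# Reachability set of a hypothetical two-place cycle holding two tokens.
two_tokens = {(2, 0), (1, 1), (0, 2)}

print(bound(fork_join), is_k_safe(fork_join, 1))    # safe net
print(bound(two_tokens), is_k_safe(two_tokens, 1))  # 2-bounded, not safe
```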

2.2.3.4 Modeling with Petri Nets

Petri  nets  were  used  to  model  various  types  of  systems,  where 
places  represent  conditions  and  transitions  represent  events. 
Hence  a  full  place  shows  the  holding  of  a  condition,  and  when  all 
conditions  prior  to  an  event  are  holding,  then  the  event  can 
occur  (a  transition  is  enabled).  PN  models  allow  all  possible 
states  of  a  system  to  be  examined,  so  that  it  can  be  determined 
whether  sequences  of  events  leading  to  undesirable  conditions 
exist  (e.g.  deadlock  conditions). 

In  modeling  parallel  computations,  the  firing  of  transitions 
in  a  PN  represents  the  execution  of  computations,  while  tokens  in 
places  represent  the  conditions  under  which  computations  can  take 
place.  A  computation  sequence  follows  the  execution  sequence  of 
transitions. It has been found by Gostelow [GOS71] and Peterson
[PET74,80] that the control graphs defined above and PN
computation sequences are in the same theoretical class of
formal models. Figure 2.3 shows the mapping of control graphs to
PNs. Therefore, figure 2.2 is the equivalent PN model of the control
graph in figure 2.1 [BEA77, MOL81].


Ramchandani [RAM74] extended the standard PNs to include
a measure of time, giving what are called Timed PNs. The basic idea of
the extension was simply to add a label to each transition
indicating how long that transition takes to fire (which
represents the computation time). These time values are fixed
(deterministic). Molloy [MOL82], and others, introduced what is
called the Stochastic PN (SPN), where transition firing times are
exponentially distributed random variables. In this case
transitions are characterized by their firing rates, which can
be marking (or state) dependent. This extension was significant
in the sense that it defines a non-deterministic model that can
be analyzed by Markov chains (MC). A formal definition of the
SPN is thus the following:

$SPN = (P, T, I, O, R)$, where $P$, $T$, $I$, and $O$ are defined as above, and
$R = \{r_1, \ldots, r_m\}$ is the set of firing rates associated with the
transitions.

The  problem  with  analyzing  SPNs,  however,  is  that  the  number 
of  states  of  the  associated  MC  grows  very  fast  with  the 
dimensions  of  the  graph. 
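
The link between an SPN and its Markov chain can be sketched as follows: every reachable marking becomes a state, and every enabled transition contributes its firing rate to the corresponding off-diagonal entry of the generator matrix. The Python sketch below assumes a bounded net, constant (marking-independent) rates, and an irreducible chain; the function names and the two-place example are illustrative and not taken from the report.

```python
from collections import deque
from fractions import Fraction

def build_ctmc(initial, transitions):
    """Return (states, q): reachable markings and the generator matrix.

    transitions: list of (input_vector, output_vector, rate) triples.
    """
    states, index = [initial], {initial: 0}
    arcs = []                                  # (from, to, rate)
    queue = deque([initial])
    while queue:
        m = queue.popleft()
        for inp, out, rate in transitions:
            if all(i <= x for i, x in zip(inp, m)):        # enabled
                nxt = tuple(x - i + o for x, i, o in zip(m, inp, out))
                if nxt not in index:
                    index[nxt] = len(states)
                    states.append(nxt)
                    queue.append(nxt)
                arcs.append((index[m], index[nxt], Fraction(rate)))
    n = len(states)
    q = [[Fraction(0)] * n for _ in range(n)]
    for i, j, rate in arcs:
        if i != j:
            q[i][j] += rate
            q[i][i] -= rate                    # each row sums to zero
    return states, q

def steady_state(q):
    """Solve pi Q = 0 with sum(pi) = 1 by exact Gauss-Jordan elimination."""
    n = len(q)
    # Row j holds the balance equation sum_i pi_i q[i][j] = 0;
    # the last equation is replaced by the normalization condition.
    a = [[q[i][j] for i in range(n)] + [Fraction(0)] for j in range(n)]
    a[-1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[r][n] for r in range(n)]

# Two-place cycle: t1 moves a token p1 -> p2 at rate 2, t2 moves it back at rate 3.
CYCLE = [((1, 0), (0, 1), 2), ((0, 1), (1, 0), 3)]

for tokens in (1, 2, 5):
    states, q = build_ctmc((tokens, 0), CYCLE)
    print(tokens, len(states), steady_state(q))
```

Even in this two-place net the number of states equals the number of tokens plus one; with more places the reachability set grows combinatorially, which is the difficulty noted above.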

Marsan et al. [MAR83] have extended the SPNs to the so-called
Generalized SPNs (GSPNs). GSPNs are obtained by allowing
transitions to belong to two different classes: immediate
transitions and timed transitions. Immediate transitions fire in
zero time once they are enabled, while timed transitions behave
as in SPNs. A formal definition of a GSPN is thus as for an SPN,
where now the set $R$ contains only $m'$ elements, $m'$ being the
number of timed transitions. The significance of this extension
is due to the fact that the operating sequence of a system
comprises activities whose durations differ by orders of
magnitude. It is then conceivable to model the short activities
only from the logical point of view, whereas time is associated
with the longer ones. This choice becomes particularly convenient
if by doing so the number of states of the associated MC is
reduced, hence reducing the solution complexity. Figure 2.4 shows
an example of a GSPN (immediate transitions are drawn as double
bars). This model is the same as the one in Figure 2.2, except
that times for the activities of synchronizing tasks 2 and 3 as
well as the conditional transfer at node 4 are neglected. The
analysis of the GSPN will be considered in more detail later.

2.3.  Classifications  of  Parallel  Programs 

Using the above models of computation, Herzog et al. [HER79]
classified the structure of a variety of application programs
into four types as follows:

1- Type-1 program structure (figure 2.5(a)): The program
consists of a loop which may be passed several times. This loop
consists of a primary task S0, upon completion of which n
independent concurrent tasks are spawned. A new loop may be
started if and only if all n tasks are completed. Problems of
this type include algorithms for the solution of linear algebraic
or partial differential equations, optimization procedures,
simulations including subruns for the purpose of estimating
confidence intervals, and problems of picture processing.

2- Type-2 program structure (figure 2.5(b)): Here, the program
also consists of a loop. However, the n concurrent tasks
influence each other in some way rather than being completely
independent. Tasks interact at some points called interaction
points. These points divide the tasks into stages (subtasks). At
the end of each stage a task communicates with some other tasks
before the next stage of computation is initiated. Compared to
type-1, not only global but also local synchronization is
necessary.

3- Type-3 program structure: Here the degree of parallelism
varies rather than being constant. An example is shown in figure
2.5(c).

4- Type-4 program structure: These program structures consist
of completely independent tasks S1,...,Sn that execute
concurrently with the primary task S0 and terminate
independently. Here, no synchronization is necessary since the
tasks are completely independent (figure 2.5(d)).


Figure 2.5 Classification of Parallel Programs

Kung [KUN76, KUN80] classified algorithms for multiprocessors
as synchronous and asynchronous. In a synchronized algorithm
(type-1 and type-2), the program is decomposed into tasks which
are synchronized at interaction points. At these points tasks are
blocked while waiting for inputs from others. The loss due to
waiting was characterized by a penalty factor defined as follows:
Suppose that we want to synchronize k identical tasks, and that
the time taken by the ith task is a random variable ti. Since the
tasks are all identical, t1,...,tk are identically distributed
random variables with mean, say, t. The expected time taken until
all of the tasks are completed is the mean T of the random
variable max(t1,...,tk) rather than t. In general, T is larger
than t. The penalty factor for synchronizing the k tasks is then
defined as the ratio T/t. Clearly if the penalty factor is large,
then the performance of the synchronized algorithm is largely
degraded. Baudet [BAU76] has observed that, if the ti's are
independent and identically distributed exponential random
variables, then the penalty factor for k tasks is the kth
harmonic number Hk. Note that Hk grows like ln k as k increases.
Hence synchronized algorithms should be used only when there are
few tasks to be synchronized. Furthermore, the execution time of
the needed synchronization primitives is usually non-negligible.
Thus, it is not always advantageous to create as many concurrent
tasks as possible according to the maximal decomposition of a
problem.
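
The harmonic-number result for exponential task times can be checked numerically. The sketch below is illustrative only: the function names, the number of simulation runs, and the seed are assumptions, and the simulation simply averages the maximum of k independent exponential samples.

```python
import random
from fractions import Fraction


def harmonic(k):
    """Exact kth harmonic number Hk = 1 + 1/2 + ... + 1/k."""
    return sum(Fraction(1, j) for j in range(1, k + 1))


def simulated_penalty(k, mean=1.0, runs=20000, seed=0):
    """Estimate T/t, where T is the mean of max(t1,...,tk) and t the task mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        # expovariate takes the rate, i.e. the reciprocal of the mean
        total += max(rng.expovariate(1.0 / mean) for _ in range(k))
    return (total / runs) / mean
```

For example, `harmonic(4)` is 25/12, so synchronizing four such tasks costs roughly twice the mean time of a single task.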

Asynchronous parallel algorithms (type-4 program structure)
consist of several concurrent asynchronous tasks. Communication
between these tasks is achieved through a set of global variables
or shared data. The main characteristic of these tasks is that
they never wait for the completion of others at any time, but
continue or terminate according to whatever information is
currently contained in the global variables. However, to ensure
logical correctness, the operations on global variables are
programmed as critical sections. This asynchronous behaviour
leads to serious issues regarding the correctness and efficiency
of an algorithm. The correctness issue arises because, during the
execution of an algorithm, operations from different tasks may
interleave in an unpredictable manner. The efficiency issue
arises because any synchronization introduced for correctness
reasons takes extra time and also reduces concurrency. Kung
examined various techniques for dealing with these issues, and
also showed examples of synchronous as well as asynchronous
implementations of zero searching and iterative algorithms.


CHAPTER  3 

CURRENT  MODELS  OF  PARALLEL  PROCESSING  SYSTEMS 

In this chapter, current analytical models proposed in the
literature for parallel processing systems will be discussed
[MAE 76, PET 75, PRI 75, TOW 75, TOW 78, BAR 79, HEI 82, HEI 83,
TOM 84].

3.1 Models for CPU:IO Overlap

The models developed in [MAE 76, PET 75, PRI 75, TOW 75, TOW 78]
were primarily intended to model CPU:IO overlap using double or
multiple buffering. This means that a program issues two or more
concurrent requests on distinct system resources. In the
following paragraph, we briefly describe the most recent of such
models, developed by Towsley [TOW 78].

The model developed in [TOW 78] is based on the central
server QN model. The assumption normally made in this model is
that a job alternates between CPU and IO activities. The job may
be thought of as repeating cycles, where each cycle consists of
two tasks: a task requiring the use of the CPU followed by one
requiring the use of an IO. In an overlap system (figure 3.1), a
cycle may consist of three tasks: CPU1 followed by the concurrent
tasks CPU2 and IO. The approximate aggregation technique known in
QNs as Norton's Theorem was used to obtain a network with two
queues: the CPU queue and an aggregate IO queue. The aggregate
network under the overlapped job cycle of figure 3.1 was then
analyzed using the exact recursive analysis of Markov models of
two-queue networks developed by Herzog, Woo, and Chandy [HER 75].

The accuracy and validity of the model have been verified
against detailed simulations. This model can also be extended to
model overlap in CPU:CPU and IO:IO activities. However, such
concurrent tasks use only one resource each before they
synchronize and merge.

Models of systems with programs defined as a set of
concurrent tasks, where each task requires multiple accesses to
many different system resources before it either terminates
independently or communicates with other tasks, have been
developed in [HED 82, HED 83, TOM 84]. The remainder of this
chapter will be devoted to the description of such models.

3.2  Models  for  Asynchronous  Tasks: 

The model developed in [HED 82] does not account for any
synchronization between tasks. It assumes a system workload
consisting of a set of statistically identical jobs. Each job
consists of a primary task (labeled 1) and zero or more
statistically identical secondary tasks (labeled 2). The
secondary tasks are spawned by the primary task sometime during
its execution and execute concurrently with it, competing for
system resources. A secondary task is otherwise assumed to run
and terminate independently. An approximate multi-chain QN model
was developed with two chains, one closed and the other open. The
closed chain models the execution of primary tasks in the system,
and a special node (node 0) was defined such that a secondary
task is spawned whenever a primary task enters this node. The
open chain, which shares the same system resources with the
closed chain, models the execution of secondary tasks, which
compete for the system resources with the primary tasks and leave
the network when they terminate. The arrival rate of the open
chain is set equal to the throughput of primary tasks at node 0
of the closed chain. The arrival process of primary tasks at
node 0 is approximated by a Poisson process which is
independent of the state of the network. Since the throughput of
the closed chain is itself a nonlinear function of the arrival
rate of the open chain, a closed form solution is not available.
An iterative algorithm was used to solve this nonlinear equation,
and the conditions for its convergence were given. The accuracy
of the approximation was studied through comparison with
simulation.

To illustrate the above model, consider the two-chain QN
shown in figure 3.2.

Let N   = number of primary tasks,
    r2  = arrival rate of the open chain,
    rij = throughput of task j at queue i; i=0,1,2, and j=1,2,
    vi2 = mean number of visits a secondary task makes to queue
          i (open chain),
    ui2 = utilization of server at queue i due to secondary
          tasks,
    Qij = mean queuing time of task j at queue i,
    Lij = mean queue length of task j at queue i,
    sij = mean service time of task j at server i,
    f   = probability that a primary task will go to node 0,
    p2  = probability that a secondary task returns to Q1 for
          some more service.

At steady state,

    v12 = 1 + p2 v12,  so that  v12 = 1/(1-p2)

Then,
    r12 = r2 v12
    u12 = r2 v12 s12, and
    u22 = r2 v12 s22

Then using the MVA algorithm for mixed QNs [REI 83], we have,

    Q11(N) = s11 [1 + L11(N-1)]/(1-u12)
    Q21(N) = s21 [1 + L21(N-1)]/(1-u22)
    r11(N) = N/[Q11(N) + (1-f) Q21(N)]
    r01 = f r11, and r21 = (1-f) r11
    Li1(N) = ri1(N) Qi1(N) ; i=1,2

From the above,

    r2(N) = r01 = f N/{ [(s11/(1 - r2(N) v12 s12)) (1 + L11(N-1))] +
            [((1-f) s21/(1 - r2(N) v12 s22)) (1 + L21(N-1))] }

The above nonlinear equation is solved iteratively for r2, such
that r2 vi2 si2 < 1 for i=1,2.

If this condition is not satisfied, then a stable solution
cannot be obtained, which means that primary tasks are able to
generate more secondary tasks than the system can handle, and the
mean queue length of secondary tasks at at least one queue will
be infinite.
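
The fixed point r2 = r01 can be sketched in a few lines. This is an illustration under stated assumptions, not the iterative algorithm of [HED 82], whose details are not given here: bisection is swapped in as the root finder, the MVA recursion is run from population 1 up to N with the servers slowed down by the open-chain utilizations u12 and u22, and the numerical parameter values used with it are hypothetical.

```python
def closed_chain_throughput(r2, N, f, p2, s11, s21, s12, s22):
    """Return r01, the primary-task throughput at node 0, for a trial r2."""
    v12 = 1.0 / (1.0 - p2)            # visits of a secondary task to queue 1
    u12 = r2 * v12 * s12              # open-chain utilization of queue 1
    u22 = r2 * v12 * s22              # open-chain utilization of queue 2
    if u12 >= 1.0 or u22 >= 1.0:
        raise ValueError("open chain saturates a server")
    L11 = L21 = 0.0                   # queue lengths at population 0
    r11 = 0.0
    for n in range(1, N + 1):         # MVA recursion over the population
        Q11 = s11 * (1.0 + L11) / (1.0 - u12)
        Q21 = s21 * (1.0 + L21) / (1.0 - u22)
        r11 = n / (Q11 + (1.0 - f) * Q21)
        L11 = r11 * Q11
        L21 = (1.0 - f) * r11 * Q21   # r21 = (1-f) r11
    return f * r11                    # r01 = f r11


def solve_r2(N, f, p2, s11, s21, s12, s22, iterations=200):
    """Bisection on r01(r2) - r2 inside the stable region r2 vi2 si2 < 1."""
    v12 = 1.0 / (1.0 - p2)
    lo = 0.0
    hi = (1.0 - 1e-9) / (v12 * max(s12, s22))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if closed_chain_throughput(mid, N, f, p2, s11, s21, s12, s22) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is used because r01 falls as r2 rises (more secondary tasks slow the primary tasks down), so the difference r01(r2) - r2 changes sign exactly once in the stable region.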

3.3  A  Model  for  Synchronous  Tasks: 

In [HED 83], another model was developed for synchronous
tasks. It assumes a workload consisting of a set of statistically
identical jobs. Each job consists of a primary task and a fixed
number of synchronous concurrent secondary tasks. The system
again consists of a finite number of servers including processors
and IO devices, a particular pseudo server labeled 3, and a
finite number N of jobs. Each primary task of a job executes on a
sequence of servers and whenever it enters server 3, it splits
into a fixed number (>= 2) of secondary tasks. Each primary task
is the parent of its secondary tasks, and the latter are said to
be siblings. Each secondary task executes on a sequence of
servers. A secondary task is considered complete upon entering
node 0. At node 0, the secondary task must wait until all of its
siblings have completed execution, at which time the primary task
becomes active again and the process is repeated. Synchronization
between secondary tasks is achieved by requiring all siblings to
complete execution before the job can continue processing. Two
approximations were proposed to come up with a tractable solution
for the above model. Both approximations are based on solving a
set of related multi-chain closed QNs.

The first method is based on decomposition approximation,
following the decomposition approach of Courtois [COU 77]. For
purposes of illustration, we restrict our discussion to the case
in which a primary task subdivides into two secondary tasks. Each
of the N jobs may either have its primary task (labeled 0) or
have one or both of its secondary tasks (labeled 1 and 2) active.
At time t, let ai(t) denote the number of active, i.e.,
executing, tasks of type i, and let wi(t) denote the number of
type i secondary tasks which have completed execution and are
waiting for their respective siblings to complete; i = 1,2. Let
a(t) = (a0(t),a1(t),a2(t)) and w(t) = (w1(t),w2(t)). Then at time
t, N = a0(t) + ai(t) + wi(t) for i = 1,2.

If the primary and secondary tasks are relatively long in the
sense that many servers are visited before the task completes,
then changes in a(t) will occur infrequently as compared to
changes in the queue lengths of tasks at the servers. In this
case, it is reasonable to assume that queue length distributions
converge to steady state distributions prior to the next change
in a(t). Therefore, for every feasible multiprogramming level
(mpl) a = (a0,a1,a2), a closed three-chain QN with populations
a0, a1, a2 respectively is solved for the performance parameters
(throughputs, utilizations, and mean queue lengths). Let ri(a)
denote the throughput of type i tasks at node 0 when the mpl is
a. The process {a(t), t >= 0} is then modeled as a finite state
Markov chain with state space E and transition rate matrix Q. The
state space E is a subset of
{0 <= a0 <= N, 0 <= a1 <= N-a0, 0 <= a2 <= N-a0}. The exact
definition of E however is quite complex. Another assumption was
made to facilitate the state space definition by keeping track of
only the mpl without specifying the identity of the waiting
tasks. Let pi(a) be the probability that a type i secondary task
that has just completed finds that its sibling has also
completed, when the mpl is a. These probabilities are defined as
follows,


    p1(a) = w2/a1   if a1 > 0
          = 0       if a1 = 0

    p2(a) = w1/a2   if a2 > 0
          = 0       if a2 = 0

where w1 = N - a0 - a1 and w2 = N - a0 - a2.

Then the state space E is given by,

    E = { a: 0 <= a0 <= N, 0 <= a1 <= N-a0, 0 <= a2 <= N-a0,
          N <= a0 + a1 + a2 },

and the off-diagonal elements q(a,b) of the transition rate
matrix Q are listed below (q(a,a) = - sum over b != a of q(a,b)).


b                    q(a,b)            explanation

(a0-1,a1+1,a2+1)     r0(a)             task 0 completed, tasks 1,2 spawned
(a0,a1-1,a2)         r1(a)(1-p1(a))    task 1 completed, sibling active
(a0,a1,a2-1)         r2(a)(1-p2(a))    task 2 completed, sibling active
(a0+1,a1-1,a2)       r1(a) p1(a)       task 1 completed, sibling waiting
(a0+1,a1,a2-1)       r2(a) p2(a)       task 2 completed, sibling waiting

The stationary distribution P of this Markov chain can be
obtained by solving PQ = 0, such that the elements of P sum to
one. The global performance parameters can then be obtained,
e.g., the job throughput r is given by,

    r = sum over a in E of P(a) r0(a)
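
The construction of E, Q, and the job throughput r can be sketched as follows. This is a minimal illustration under loud assumptions: the throughputs ri(a) would come from solving a closed three-chain QN for each mpl, which is replaced here by a hypothetical contention-free stand-in `no_contention`; the probabilities pi(a) are taken as 0 when ai = 0; and PQ = 0 is solved by repeated multiplication with a uniformized chain rather than by a linear solver.

```python
from itertools import product


def state_space(N):
    """All mpl vectors a = (a0, a1, a2) in E for N jobs."""
    return [(a0, a1, a2)
            for a0, a1, a2 in product(range(N + 1), repeat=3)
            if a1 <= N - a0 and a2 <= N - a0 and a0 + a1 + a2 >= N]


def transitions(a, N, throughput):
    """Positive off-diagonal rates q(a, b), following the transition table."""
    a0, a1, a2 = a
    w1, w2 = N - a0 - a1, N - a0 - a2      # waiting type 1 and type 2 tasks
    p1 = w2 / a1 if a1 > 0 else 0.0        # assumption: 0 when a1 = 0
    p2 = w1 / a2 if a2 > 0 else 0.0        # assumption: 0 when a2 = 0
    r0, r1, r2 = throughput(a)
    rates = {
        (a0 - 1, a1 + 1, a2 + 1): r0,
        (a0, a1 - 1, a2): r1 * (1.0 - p1),
        (a0, a1, a2 - 1): r2 * (1.0 - p2),
        (a0 + 1, a1 - 1, a2): r1 * p1,
        (a0 + 1, a1, a2 - 1): r2 * p2,
    }
    return {b: q for b, q in rates.items() if q > 0.0}


def stationary(N, throughput, iterations=5000):
    """Approximate P with PQ = 0 by iterating the uniformized chain."""
    E = state_space(N)
    Q = {a: transitions(a, N, throughput) for a in E}
    lam = 1.1 * max(sum(q.values()) for q in Q.values())
    P = {a: 1.0 / len(E) for a in E}
    for _ in range(iterations):
        nxt = {a: P[a] * (1.0 - sum(Q[a].values()) / lam) for a in E}
        for a in E:
            for b, q in Q[a].items():
                nxt[b] += P[a] * q / lam
        P = nxt
    return P


def job_throughput(N, throughput):
    """r = sum over a in E of P(a) r0(a)."""
    P = stationary(N, throughput)
    return sum(P[a] * throughput(a)[0] for a in P)


def no_contention(mu0, mu1, mu2):
    """Hypothetical stand-in for the QN solution: ri(a) = ai * mui."""
    return lambda a: (a[0] * mu0, a[1] * mu1, a[2] * mu2)
```

With a single job and the stand-in rates, the chain reduces to a primary phase followed by the maximum of two exponential phases, which gives a simple value to compare against.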

The second approximate method proposed is based on the method
of complementary delays and is similar to the one proposed in
[JAC 81] for simultaneous resource possession. It consists of
iteratively solving a sequence of product form (pf) QNs. The
synchronization delay experienced by a task was modeled by an
infinite server queue. The mean synchronization delays are
estimated by assuming the task response times to be
independent, exponentially distributed random variables. The mean
task response times are obtained from the solution to the pf QN
at the previous iteration. However, the decomposition
approximation was found to be clearly superior to the above
method. The former was found to be remarkably accurate when
compared to simulation results, predicting throughputs and device
utilizations with a mean relative error of one percent.


3.4 A Model for a Task System:

Thomasian et al. [TOM 83] developed a QN model based on the
above decomposition approximation. The model assumes that there
exists only one active job in the system (monoprogramming), which
is defined by a set of tasks T = {T1,T2,...,Tn} with a partial
order defined on T specifying deterministic precedence
constraints. Only directed acyclic graphs were considered.
From such a graph, a MC is constructed, the state of which
defines the active task combination. A closed multi-class QN is
solved for each state of the MC.


CHAPTER 4

MODELING PARALLEL PROCESSING SYSTEMS
USING THE GENERALIZED STOCHASTIC PETRI NETS

4.1  Introduction 

In the previous chapter current models for parallel
processing were discussed. Such models were based on analytical
queuing networks (QNs). Due to the fact that analytical QN
models become intractable when modeling parallel operations,
several approximate models were developed. Even though the
accuracy of such models was found to be adequate, they were
restricted to a workload consisting of statistically identical
jobs with one type of parallelism or another. For example, the
model in section 3.2 assumes only asynchronous concurrent tasks.
In section 3.3, the model was developed for jobs with a fixed
number of synchronous tasks. And in section 3.4, the model is
restricted to one active job consisting of a fixed number of
tasks with a deterministic precedence relation.

From  the  classification  of  parallel  programs  given  in  chapter 
2,  it  is  clear  that  the  active  jobs  in  the  system  may  consist  of 
both  synchronous  and  asynchronous  tasks.  Moreover,  a 
probabilistic  model  is  needed  to  represent  a  wide  variety  of 
active  job  structures. 

In this chapter, a different analytical modeling tool,
namely, the Generalized Stochastic Petri Nets (GSPNs), is used to
model activities in a parallel processing system. Such activities
include parallel operations, synchronization, contention for
resources, and queuing.

GSPNs are a very versatile modeling tool. The probabilistic
nature of GSPNs allows system operations to be described at a
high level of abstraction. As indicated in chapter 2, Petri Nets
(PNs) are very powerful in modeling parallel activities and
synchronization. A distinctive feature of GSPNs, with respect to
standard PN models, is their isomorphism with Markovian models,
which allows evaluation of the performance of a system. Moreover,
GSPN models eliminate a major difficulty in the direct
construction of a Markov chain, that is, its state space
definition. Also, GSPN models retain many of the characteristics
of the system; therefore they provide greater insight into the
various system activities.

In  section  2,  a  more  detailed  description  of  GSPNs  will  be 
given.  The  analysis  of  GSPNs,  however,  will  be  deferred  to  the 
next  chapter.  In  section  3,  modeling  parallel  processing  systems 
using  GSPNs  will  be  considered.  And  in  section  4,  an  approximate 
hierarchical  model  for  large  scale  systems,  using  both  tractable 
QNs  and  GSPNs,  will  be  presented. 

4.2  Description  of  GSPNs 

Recall the definition of a Stochastic Petri Net (SPN)
[MOL81], which consists of a set of places P, a set of
transitions T, the input and output functions I and O, and the
set of transition firing rates R.

SPN = (P,T,I,O,R),

where P = {p1,p2,...,pn}, T = {t1,t2,...,tm},

I: P x T -> N, O: T x P -> N, and R = {r1,r2,...,rm}.

For a given initial marking M1, the reachability set S can be
obtained using the same analysis performed on standard PNs.

Once enabled, a transition ti ∈ T takes an exponentially
distributed random time, with rate ri, to fire. In a GSPN,
transition firing rates can be infinite. Therefore the set T is
partitioned into a set of timed transitions with finite firing
rates defined in the set R, and a set of immediate transitions.
Clearly, for any marking in S at which several timed transitions
and one immediate transition are enabled, the immediate
transition fires with probability one. However, if several
immediate transitions are simultaneously enabled at a marking,
then it is necessary to define a probability distribution on
the set of enabled immediate transitions according to which the
firing immediate transition is selected. This is defined more
precisely as follows.

Definition 1: Let Ti(M) = {ti1,ti2,...,tik} be the set of all
enabled immediate transitions at a marking M ∈ S. If k > 1, then
a probability distribution, called the switching distribution
PM(tij), j = 1,...,k, with PM(ti1) + ... + PM(tik) = 1, is
defined on the set Ti(M) such that transition tij fires with
probability PM(tij). This set of immediate transitions together
with the associated switching distribution is called a random
switch.
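
A random switch in the sense of Definition 1 can be sketched as a weighted draw over the enabled immediate transitions. The function name, the dictionary representation of the switching distribution, and the example probabilities are assumptions made for illustration.

```python
import random


def fire_random_switch(switching, rng=random):
    """Select the immediate transition that fires.

    `switching` maps each enabled immediate transition to its
    probability PM(tij); the probabilities must sum to one.
    """
    transitions = list(switching)
    weights = [switching[t] for t in transitions]
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("switching distribution must sum to one")
    return rng.choices(transitions, weights=weights, k=1)[0]
```

For instance, with the hypothetical distribution `{"t3": 0.7, "t4": 0.3}`, about seven draws in ten select t3.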


The reachability set S of a GSPN is a subset of the
reachability set of the associated PN, because precedence rules
introduced with the immediate transitions do not allow some
states to be reached. For example, figure 4.1 shows a GSPN where
{t1,t2} are timed transitions, and {t3,t4} are immediate
transitions. Clearly a marking with two tokens in place p3
cannot be reached. Note also that a switching distribution must
be defined on transitions t3 and t4 since they are simultaneously
enabled when there is a token in p3.

Figure 4.1

A crucial aspect of the definition of a GSPN is the
identification of all random switches and the correct definition
of the switching distribution. This distribution, however, cannot
be clearly specified when the set Ti contains independent
transitions, since it then represents a possibly unclear relation
between fast independent events in a system. In the following we
restrict our discussion to immediate transitions in order to
address this issue.

Definition 2: A set of transitions is said to be independent if
and only if the transitions do not share any input place. Let
Pin(ti) = { pk | I(pk,ti) > 0 } be the set of input places of
transition ti ∈ T. Then transitions ti, tj ∈ T are independent if
and only if Pin(ti) ∩ Pin(tj) = ∅. Otherwise they are said to be
dependent.
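
The independence test of Definition 2 amounts to checking that two input-place sets are disjoint. The sketch below assumes a hypothetical representation in which `I[t]` maps each input place of transition t to its arc multiplicity; the grouping helper merges transitions that are linked through shared input places.

```python
def input_places(t, I):
    """Pin(t): the places pk with I(pk, t) > 0."""
    return {p for p, n in I[t].items() if n > 0}


def independent(ti, tj, I):
    """True when ti and tj share no input place."""
    return input_places(ti, I).isdisjoint(input_places(tj, I))


def dependent_groups(transitions, I):
    """Group transitions so that members of different groups are independent."""
    groups = []
    for t in transitions:
        # existing groups that contain a transition dependent on t
        linked = [g for g in groups
                  if any(not independent(t, u, I) for u in g)]
        merged = {t}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups
```

Applied to four immediate transitions where t4 and t5 share place p1 and t6 and t7 share place p2, the helper returns the two groups {t4,t5} and {t6,t7}.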

For example, consider the portion of a GSPN shown in figure
4.1(b). Assume t1 fires first, so that a token moves to place
p1, thus enabling the immediate transitions t4 and t5. Since
these transitions are dependent, the switching distribution can
be easily specified because it depends on some local behaviour of
the system. Let P(t4) and P(t5) be the switching distribution
defined on {t4,t5}. Similarly, if t2 fires first, let P(t6) and
P(t7) be the distribution defined on {t6,t7}. Now if t3 fires
first, a token is placed in both p1 and p2, thus enabling the
four immediate transitions t4, t5, t6, and t7. However, since
transitions in {t4,t5} and {t6,t7} are independent, the
switching distribution on {t4,t5,t6,t7} accounts for a possibly
unclear relation between two separate parts of the system.

It is implicitly assumed in the above definition of a random
switch that no two transitions can fire simultaneously even if
they are independent. In the sequel we relax this assumption to
find a more general definition of the switching distribution.

Definition 3: A set of enabled transitions are said to be
mutually exclusive, i.e., only one of them can fire, if they
are dependent.

Definition 4: For each set Tj = {tj1,tj2,...,tjr} of mutually
exclusive immediate transitions, define a switching probability
distribution Pd(tjk), k = 1,2,...,r, such that
Pd(tj1) + ... + Pd(tjr) = 1.

For example, figure 4.1(c) shows a set of dependent
transitions {t1,t2,t3,t4} with a common input place p2. In a
marking with a token in both p1 and p2, t1 and t2 are mutually
exclusive and a switching distribution must be specified on the
subset {t1,t2}. Similarly, for a marking where p1, p2, and p3 are
full, a switching distribution must be defined on the set
{t1,t2,t3,t4}.

Now let H(M) be the set of all enabled immediate transitions
at a marking M. H(M) can be partitioned into several subsets of
mutually exclusive transitions Qi(M), i = 1,...,z, such that for
any tu ∈ Qi(M) and tv ∈ Qj(M), i != j, tu and tv are independent.
Let PdQi(tij) be the local switching distribution defined on the
set of mutually exclusive transitions Qi(M). Assuming that
independent transitions can fire simultaneously, let EM =
{E1,E2,...,EJ} be the event space at marking M, where Ei,
i = 1,...,z, are the events when only one transition in Qi(M)
fires, Ei, i = z+1,...,z + z!/(2!(z-2)!), are the events when
exactly two transitions fire simultaneously, and EJ, J = 2^z - 1,
is the event when z transitions fire. Also let P(Ei) be the
probability of event Ei, i = 1,...,J, such that
P(E1) + ... + P(EJ) = 1. These probabilities can be defined by
the system analyst if the relations between the parts of the
system represented by the sets of transitions Qi(M) are clear. If
this is not possible, the above events can be assumed to be
equally likely, and in this case

    P(Ei) = 1/(2^z - 1) , i = 1,...,(2^z - 1)

The global switching distribution of the set H(M) can then
be defined as follows,

    PM(ti1,ti2,...,tik) = P(Ek) . PdQj1(ti1) . PdQj2(ti2)
                          .... PdQjk(tik) ,

    k = 1,...,(2^z - 1)

where PM(ti1,ti2,...,tik) is the probability that the set of
transitions {ti1,ti2,...,tik} will simultaneously fire, Ek is the
event that k transitions, one from each Qjs, s = 1,...,k, will
fire, and PdQjs(tis) is the probability that transition
tis ∈ Qjs will fire.
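
The global switching distribution can be sketched by enumerating the events and multiplying the local probabilities. In this illustration each event is identified by the non-empty subset of groups Qi(M) that fire, events default to being equally likely, and the list-of-dictionaries input format and the example probabilities are assumptions.

```python
from itertools import combinations, product


def global_switching(local, event_prob=None):
    """Probability of every set of simultaneously firing transitions.

    `local[i]` maps the transitions of the mutually exclusive set
    Qi(M) to their local switching probabilities.
    """
    z = len(local)
    # one event per non-empty subset of the z groups: 2**z - 1 events
    events = [c for size in range(1, z + 1)
              for c in combinations(range(z), size)]
    if event_prob is None:
        event_prob = {e: 1.0 / (2 ** z - 1) for e in events}
    dist = {}
    for e in events:
        # pick one transition from each group taking part in the event
        for choice in product(*(local[i].items() for i in e)):
            prob = event_prob[e]
            for _, pd in choice:
                prob *= pd
            dist[frozenset(t for t, _ in choice)] = prob
    return dist
```

For two groups such as {t4,t5} and {t6,t7} there are three events (the first group fires, the second fires, or both fire), and the resulting probabilities over all firing sets sum to one.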

4.3  Modeling  Parallel  Processing  Systems  Using  GSPNs 

Because of the nature of parallel processing systems, a model
of such systems has to handle such phenomena as contention for
multiple resources, queueing, and parallel and sequential
operations. As demonstrated earlier, unlike Petri nets,
analytical queueing network models are not adequate for modeling
parallel operations. However, they are very powerful in modeling
contention for resources. In this section, we demonstrate by
examples that GSPNs can also be used to model contention for
multiple resources.

Molloy [MOL81] addressed queuing issues in Stochastic Petri
Nets (SPNs), where places are viewed as queues, and transitions
model arrival and departure events. For example, consider the SPN
shown in figure 4.2(a). This SPN represents a service center
with a stream of customers arriving with an exponentially
distributed inter-arrival time with a state dependent rate r1(m),
and an exponentially distributed service time with state
dependent rate r2(m), where m is the number of customers in the
center (number of tokens in p1).

If r1(m) and r2(m) are constants, i.e., they do not depend on
m, then the SPN represents an M/M/1 queue. For an M/M/k queue,
r1(m) is again constant, and r2(m) is given by,

    r2(m) = m.r2 , 0 < m < k
          = k.r2 , m >= k
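
The marking-dependent service rate of the M/M/k case can be written as min(m,k).r2, and the resulting birth-death chain solved directly. The sketch below is illustrative: it truncates the number of tokens in p1 at a hypothetical bound `max_tokens`, which is an assumption needed to keep the chain finite, and uses the standard product form for birth-death chains.

```python
def service_rate(m, r2, k):
    """Firing rate r2(m) of the service transition with m tokens in p1."""
    return min(m, k) * r2


def stationary(r1, r2, k, max_tokens):
    """Stationary token distribution of the truncated birth-death chain."""
    weights = [1.0]
    for m in range(1, max_tokens + 1):
        # balance between markings m-1 and m: p(m-1) r1 = p(m) r2(m)
        weights.append(weights[-1] * r1 / service_rate(m, r2, k))
    total = sum(weights)
    return [w / total for w in weights]
```

With k = 1 and an arrival rate of half the service rate, the probability of an empty center approaches one half as the truncation bound grows, which is the familiar M/M/1 result.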

Due to the memoryless property of the exponential
distribution, and since we have only one class of customers, the
above is adequate for both first-come first-served (FCFS) and
last-come first-served (LCFS) queuing disciplines. If the queuing
discipline is processor sharing (PS), then r2(m) will be given
by,

    r2(m) = m.r2   , 0 < m < k
          = k.r2/m , m >= k

It is not possible, however, to model service centers with
multiple classes of customers and fixed priority or FCFS queuing
disciplines by SPNs, since in this case we must take into account
the order of customers waiting for service. As shown in the
previous chapter, this class of service centers is very important
in models of parallel processing systems. In the sequel we will
show that this class of service centers can be modeled by GSPNs.

Figure 4.2 (a) and (b)

Figure 4.2 (b) shows a GSPN that represents a service center
with J classes of customers. A token in p1i, i = 1,...,J, represents
that a class i customer has arrived and is waiting for service. A
token in p2i, i = 1,...,J, represents that a class i customer is
being served. Tokens in ps represent the number of available
servers or resources. Again the inter-arrival (service) time of
class i customers is exponentially distributed with rate r1i
(r2i).

The queuing discipline is assumed to be fixed priority,
with class 1 customers having the highest priority and class J
customers the lowest. When a resource becomes available and
there are customers waiting, several immediate transitions are
simultaneously enabled. Their switching distribution is defined
such that the transition that corresponds to the highest priority
class among those waiting fires with probability one.
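
The switching distribution described above can be sketched as follows. This is only an illustration: the list-based representation of the waiting places and the function name are assumptions, not part of the model in figure 4.2 (b).

```python
def priority_switching_distribution(waiting):
    """Switching distribution over the enabled immediate transitions.

    waiting[i-1] is the number of tokens in the waiting place of class i
    (class 1 has the highest priority). A class's immediate transition is
    enabled only if at least one of its customers is waiting. The result
    maps each enabled class to its firing probability.
    """
    enabled = [cls for cls, tokens in enumerate(waiting, start=1) if tokens > 0]
    if not enabled:
        return {}
    top = min(enabled)  # lowest class index = highest priority
    return {cls: (1.0 if cls == top else 0.0) for cls in enabled}


# Hypothetical marking: no class 1 customer, two of class 2, one of class 3.
print(priority_switching_distribution([0, 2, 1]))
```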

Figure 4.2 (c) shows a GSPN model of a service center with J
customer classes and FCFS queuing discipline. The queue portion
of the model is divided into s stages where customers are ordered
according to their arrival. It is assumed in this model that the
maximum number of customers that can exist in the center is
(s+k), where k is the number of available resources (number of
tokens in ps).

The following example demonstrates the ability of GSPNs to
model sequential and synchronized parallel operation, and
contention for multiple resources.

Example 4.1: Consider a system consisting of two service
centers. Service center one (SC1) consists of an infinite number
of servers (e.g., an IS queue). Service center two (SC2) consists of
two resources with an FCFS scheduling policy. Jobs in the system
consist of either a sequential task (labeled 1), or a sequential
task followed by two synchronous parallel tasks (labeled 2 and
3). The GSPN shown in figure 4.3 models the activities in such a
system when there is only one job in the system at a time. A
token in place 1, 2, or 3 indicates that a type i task, i = 1,2,3,
is being serviced at SC1. Similarly, a token in place 10, 11, or 12
indicates that a type i task is being serviced at SC2, and a token
in place 14, 15, or 16 indicates that task i has completed. When
a job is completed we assume that a similar one immediately enters
the system. The rates of the timed transitions t_i, i = 1,2,...,6,
are given by

$$ r_i = \begin{cases} m_i / s_{i1}, & i = 1,2,3 \\ 1 / s_{k2}, & k = 1,2,3 \text{ for } i = 4,5,6 \text{ respectively} \end{cases} $$

where m_i is the number of tokens in place i, and s_ij is the mean
service time of a type i task at SCj, i = 1,2,3, j = 1,2.

Also, the branching probabilities (of immediate transitions) are
defined as follows:

p_ijk = probability that a type i task will visit SCk upon
completing service at SCj, i = 1,2,3, j,k = 1,2, and
p_ij0 = probability that a type i task will terminate upon
visiting SCj, i = 1,2,3, j = 1,2.
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Also,  p*  is  the  probability  that  a  type  1  task,  upon  terminating, 
will  spawn  two  synchronous  tasks. 

Transition t16 models the synchronization operation between
the two parallel tasks and is enabled when both tasks have
completed service. When the multiprogramming level is greater
than one, i.e., several jobs are being processed
simultaneously in the system, synchronization must be achieved
only between parallel tasks that belong to the same job.
Therefore we must find a way to identify such tasks. This can be
done by using colored tokens. In this case the initial marking is
defined by the number of tokens in p1, and each token is given a
distinct color. The state or marking of the colored GSPN is
defined by the number and color of tokens in each place, i.e.,
the current marking M is given by

$$ M = \{m_1, m_2, \ldots, m_n\}, \quad \text{and} \quad m_i = \{m_{i1}, m_{i2}, \ldots, m_{ik}\} $$

where m_ij is the number of tokens of color j in place p_i. In this
case a transition is enabled if there exists a token of the same
color in each one of its input places, and it fires by removing
these tokens and depositing a token of the same color in each one
of its output places. For example, transition t16 is enabled when
there is a token of the same color in places 15 and 16, and its
firing represents the synchronization of two parallel tasks that
belong to the same job. Tokens in place 13, however, are not
colored since they represent available resources at SC2.
Therefore, transition t13 is enabled when there is a colored
token in place 7 and a token in place 13, and fires by taking these
tokens and depositing a token in place 10 with the same color as the
enabling token in place 7. Transition t4 fires by taking a token
from place 10 and depositing one with the same color in place 17, and a
noncolored one in place 13.
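
The enabling and firing rule for colored tokens can be sketched as below. This is a minimal illustration under stated assumptions: markings are stored as a dictionary of per-place color counters, the place names are hypothetical labels, and the output place used for t16 in the example is assumed, since the text does not name it. Uncolored resource tokens (as in place 13) are not handled here.

```python
from collections import Counter


def enabled_colors(marking, inputs):
    """Colors for which every input place holds at least one token."""
    common = None
    for place in inputs:
        colors = {c for c, n in marking[place].items() if n > 0}
        common = colors if common is None else common & colors
    return common or set()


def fire(marking, inputs, outputs, color):
    """Fire a transition for one color and return the new marking."""
    if color not in enabled_colors(marking, inputs):
        raise ValueError(f"transition not enabled for color {color!r}")
    new = {place: Counter(tokens) for place, tokens in marking.items()}
    for place in inputs:
        new[place][color] -= 1   # remove the matching token
    for place in outputs:
        new[place][color] += 1   # deposit a token of the same color
    return new


# Two jobs ("red", "blue"): both red siblings are done, only one blue sibling.
marking = {
    "p15": Counter({"red": 1, "blue": 1}),
    "p16": Counter({"red": 1}),
    "p1": Counter(),  # assumed output place of t16 for this sketch
}
sync_ready = enabled_colors(marking, ["p15", "p16"])
after = fire(marking, ["p15", "p16"], ["p1"], "red")
print(sync_ready, after["p1"])
```

Only the red job synchronizes; the blue token stays in place 15 until its sibling arrives in place 16.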

It can be easily seen that the GSPN model is adequate for
explicitly representing all key activities in a parallel
processing system. However, for large scale systems with a large
number of service centers, and a workload with a large number of
different types of tasks, the model becomes quite complex. The
complexity of the model arises not only from the explicit
representation of the various activities by places and
transitions, but also from the fact that the number of states
(markings) will be prohibitively large. Therefore, the analysis
of the model will be very complex. In the next section, an
approximate hierarchical model, which is adequate for large scale
systems, is presented.

4.4  An  Approximate  Hierarchical  Model 

In this section a hierarchical model will be discussed. It
employs both QNs and GSPNs. QNs are used to represent queuing and
contention for multiple resources, and GSPNs are used to
represent concurrent activities.

The model is based on multiple time scale decomposition. The
time behaviour of the system can be divided into two time scales, a
fast time scale and a slow time scale. The fast time scale
comprises activities describing the contention or execution of
tasks at a certain resource in the system. The slow time
scale comprises activities describing the execution of tasks in
the system as a whole. The motivation behind the above assumption
comes from the fact that tasks in a system need several CPU-I/O
processing cycles before they terminate. Therefore, the sojourn
time of a task in the system is much larger than the time needed
by that task to execute at one of the resources during a cycle.

4.4.1 Model Description

The model consists of two hierarchical levels. At the lower
level, the parallel processing system under consideration is
represented by a multi-chain closed queuing network (QN). The
system consists of a finite number K of servers including
processors and I/O devices, a particular pseudoserver labeled 0,
and a finite number of jobs N (the multiprogramming level). A job
consists of several synchronous and asynchronous tasks. The
pseudoserver, server 0, is a node in the network defined to
accommodate changes in the number of tasks being processed in the
system.

Let L denote the total number of different types of
(statistically non-identical) tasks of the N jobs. The model
assumes that the routing behaviour as well as the
service time distributions of the different types of tasks are available. Let
p_ijk denote the probability that a type i task goes to server k
after visiting server j, and let S_ij denote the mean service time
of a type i task at server j. In order to make the model
computationally feasible, we assume that the QN has a product
form solution for a fixed number of the different types of tasks.
Therefore, the queuing disciplines at each of the K servers are
restricted to either FCFS, processor sharing (PS), infinite
server (IS), or last-come-first-served preemptive resume
(LCFSPR). The service times may have arbitrary distributions,
except at FCFS servers, in which case the service times of all
tasks are exponentially distributed with a common mean, i.e., S_ij
is independent of i if server j is an FCFS queue. These are the
standard assumptions for a queuing network to have a product form
solution [CHA77]. The QN can then be analyzed by the fast and
simple Mean Value Analysis (MVA) algorithm of Reiser and
Lavenberg [REI80]. However, an approximate analysis of some
non-product form QNs, for example when an FCFS queue has a
non-exponential service time distribution, can be obtained using
the extended MVA algorithm developed by Bard [BAR79].
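
To give a feel for the MVA recursion mentioned above, the following is a minimal sketch of exact MVA for a single-chain closed network. It is deliberately simpler than the multi-chain case used in the model: the restriction to one task type, the function name, and the sample demands are all assumptions made for illustration. Service demands are taken as visit ratio times mean service time per visit.

```python
def mva_single_class(demands, is_delay, n_jobs):
    """Exact MVA for a closed single-class product form network.

    demands[j]  : total service demand at station j per cycle
    is_delay[j] : True for an infinite server (IS) station,
                  False for a single-server FCFS/PS station
    Returns (throughput, mean queue lengths, utilizations).
    """
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, n_jobs + 1):
        # Arrival theorem: an arriving job sees the network with n-1 jobs.
        residence = [
            d if delay else d * (1.0 + q)
            for d, delay, q in zip(demands, is_delay, queue)
        ]
        throughput = n / sum(residence)
        queue = [throughput * r for r in residence]  # Little's law
    utilization = [throughput * d for d in demands]
    return throughput, queue, utilization


# Hypothetical balanced network: two queueing stations, unit demand, 2 jobs.
x, q, u = mva_single_class([1.0, 1.0], [False, False], 2)
print(x, q, u)
```

For a balanced network with K identical stations and demand D, the throughput is N / ((N + K - 1) D), which for the values above gives 2/3.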

At the higher level of the hierarchy, the behaviour or
structure of jobs is modeled by means of a GSPN. In the GSPN,
tokens represent tasks, places define the type and state of tasks
in the QN (e.g., a task can be active, i.e., being processed, or
completed), and transitions represent the activities of
spawning, synchronization, or execution of tasks. An enabled
timed transition indicates that a task is being processed in the
system, and it fires when the task is completed. Therefore, the
state (marking) of the GSPN defines the number of the different
types of tasks competing for resources in the QN.

The QN is solved for each state M of the GSPN with a different
number of active tasks to determine the firing rates of the enabled
transitions. The rate of transition t_i, r_i(M), is the throughput
of type i tasks at node 0 of the QN at state M. Here we also
assume that the times between task arrivals at node 0 (task
completions) are exponentially distributed. The correctness of
this assumption was proved by Courtois and Keilson [COU77, KEI78].
The local performance parameters, such as the steady state
utilization u_j(M) and mean queue length l_j(M) of server j, are
also evaluated at each state M.

The GSPN is then solved for the steady state probability
distribution. The global performance parameters are obtained from
the local ones as follows:

$$ R_i = \sum_{M \in S} r_i(M)\,P(M), \quad U_j = \sum_{M \in S} u_j(M)\,P(M), \quad L_j = \sum_{M \in S} l_j(M)\,P(M) $$

where P(M), M ∈ S, is the steady state probability distribution
and S is the state space or reachability set of the GSPN. Also,
R_i is the throughput of type i tasks at node 0, and U_j and L_j are the
global utilization and mean queue length of server j,
respectively.
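
The global parameters are probability-weighted averages of the local ones over the reachability set. A minimal sketch, assuming markings are used as dictionary keys and using made-up numbers purely for illustration:

```python
def global_average(local_values, probabilities):
    """Sum of local_values[M] * P(M) over all markings M in the state space."""
    return sum(local_values[m] * p for m, p in probabilities.items())


# Hypothetical three-marking state space with P(M) summing to one.
P = {(2, 0): 0.5, (2, 1): 0.25, (2, 2): 0.25}
u_cpu = {(2, 0): 0.4, (2, 1): 0.6, (2, 2): 0.8}  # local utilizations u_j(M)

U_cpu = global_average(u_cpu, P)
print(U_cpu)
```

The same helper applies unchanged to throughputs r_i(M) and mean queue lengths l_j(M).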

To further illustrate the above model, let us consider the
following two examples.

Example 4.3: Asynchronous tasks.

Let us consider the case where there are N statistically
identical (s.i.) jobs in the system. Each job consists of a
primary task (type 0) and one or more s.i. asynchronous (type 1)
tasks, which are initiated one at a time by the primary task, execute
concurrently with it, and terminate independently. Also consider
the central server QN model shown in figure 4.4. The model
consists of a processor queue and an I/O queue. The processor
queue is assumed to be a single server queue with a PS queuing
discipline, and the I/O queue is a single server queue with an FCFS
queuing discipline. Figure 4.5 shows the SPN model of job
behaviour (with N = 2). The state of the SPN is M = (N,k), where N is
the number of primary tasks (tokens in p1), and k is the number
of spawned secondary tasks (tokens in p2). The firing of
transition t0 represents the initiation of a secondary task, where
a token is placed in p2. The firing of t1 represents the
termination of a secondary task, where a token is taken out of
p2. The firing rates of these transitions, r0(N,k) and r1(N,k), are
the throughputs of N type 0 and k type 1 tasks at node 0 of the
QN, respectively. The corresponding Markov chain (M.C.) of the SPN is shown in
figure 4.6. This M.C. obeys local balance, i.e., at steady state,
the balance equations are

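$$ r_0(2,k-1)\,P(k-1) = r_1(2,k)\,P(k), \quad k > 0, $$

and with $\sum_{k=0}^{\infty} P(k) = 1$, the steady state probability distribution
is then

$$ P(k) = P(0) \prod_{i=0}^{k-1} \frac{r_0(2,i)}{r_1(2,i+1)}, \quad k > 0, \quad \text{and} $$

$$ P(0) = 1 \Big/ \Big( 1 + \sum_{k=1}^{\infty} \prod_{i=0}^{k-1} \frac{r_0(2,i)}{r_1(2,i+1)} \Big). $$

The global performance parameters are evaluated as follows:

$$ R_i = \sum_{k=0}^{\infty} r_i(2,k)\,P(k), \quad i = 0,1, $$

$$ U_j = \sum_{k=0}^{\infty} u_j(2,k)\,P(k), \quad \text{and} \quad Q_j = \sum_{k=0}^{\infty} q_j(2,k)\,P(k), \quad j = 1,2,\ldots,n+1. $$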

The existence of the above solution can be easily proved
since r0(2,k)/r1(2,k) < 1 for all k > 2. Therefore the term
$\prod_{i=0}^{k-1} r_0(2,i)/r_1(2,i+1)$ rapidly tends to zero as k increases, and
the above infinite sum can be accurately approximated by taking a
finite number of terms.
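
The truncated evaluation of this birth-death solution can be sketched as follows. The rate functions passed in stand for r0(2,k) and r1(2,k); in the model they come from solving the QN, whereas the closed-form rates used in the example are hypothetical and chosen only so that the answer can be checked by hand.

```python
import math


def birth_death_distribution(r0, r1, k_max):
    """Steady state P(0..k_max) of the chain, truncated at k_max.

    r0(k): rate of spawning a secondary task in state k (k -> k+1)
    r1(k): rate of terminating a secondary task in state k (k -> k-1)
    """
    weights = [1.0]  # unnormalized P(k) / P(0)
    for k in range(1, k_max + 1):
        weights.append(weights[-1] * r0(k - 1) / r1(k))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates: constant spawning rate, termination rate linear in k.
spawn = lambda k: 1.0
finish = lambda k: 2.0 * k
P = birth_death_distribution(spawn, finish, k_max=30)
print(P[0], math.exp(-0.5))
```

With these sample rates the product terms are (1/2)^k / k!, so the distribution is Poisson with mean 0.5 and P(0) is close to exp(-0.5); thirty terms are far more than needed, which reflects the rapid decay noted in the text.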

Example 4.4: Synchronous tasks.

Let us consider the case where a primary task subdivides into
exactly two secondary (type 1 and type 2) synchronous concurrent
tasks. When a secondary task completes, it must wait until its
sibling has completed, at which time the primary task becomes
active again and the process is repeated. The GSPN shown in
figure 4.7 represents this behaviour (N = 1). The numbers of tokens
in p2 and p3 are equal to the numbers of active type 1 and type 2
tasks, respectively. The numbers of tokens in p4 and p5 are the
numbers of completed tasks that are waiting for their siblings.
The reachability set of the GSPN is shown in table 4.1, and the
corresponding M.C. is shown in figure 4.8. As mentioned in the
previous section, if N > 1, the tokens are colored to achieve
proper synchronization between the secondary tasks. Therefore,
each token defined initially in p1 should have a distinct color.

Now consider the simple three-chain QN model shown in figure
4.9. Both the processor and I/O queues are two-server queues. Since
with N = 1 there will be at most two tasks in the system at any
given time, the queues are effectively IS (no contention).

Let the mean service times of the above tasks be
s_i1 = s_i2 = 1 sec., i = 0,1,2. Also let the routing probabilities
be p_i10 = p, i = 0,1,2. Since the mean service times for a task at
both queues are equal, the probability of finding at least one
task of a certain type at any one of the queues is 1/2. Therefore
the throughputs of tasks at node 0 are

$$ r_0(1) = \tfrac{1}{2} \cdot \tfrac{1}{s_{01}} \cdot p_{010} = p/2 = r_i(j), \quad i = 1,2, \quad j = 2,3,4. $$

The  rate  transition  matrix  of  the  above  M.C.  is 
Using the above matrix, the steady state probability
distribution of the GSPN can be evaluated, and the global
performance parameters can be obtained as indicated earlier.

4.4.2 Theoretical Verification

The development of the above hierarchical model is based on
the concept of hierarchical aggregation of continuous-time
discrete-state Markov processes. The theory presented here was developed
in two recent papers by Coderch et al. [COD83a,b], which
generalize the earlier work on decomposition of Markov chains
proposed by Courtois [COU77].

Let {x^p(t), t ≥ 0} be a finite state Markov process (FSMP)
with rare transitions. The transition probability matrix of this
process is P^p(t) = exp{A(p)t}, where

$$ A(p) = \sum_{k=0}^{\infty} p^k A_{0k} \qquad (3.1) $$

is the matrix of transition rates, and p ∈ [0, p_0] is a small
parameter modeling rare transitions in x^p(t). This process can be
considered to be a perturbed version of the FSMP x^0(t), where
these rare transitions are neglected (p = 0).

Definition: x^p(t) is said to be regularly perturbed if

$$ \lim_{p \to 0} \sup_{t \ge 0} \| P^p(t) - P^0(t) \| = 0; $$

otherwise, the perturbation is singular, in which case
rank A(p) > rank A_00, or equivalently, the number of ergodic
classes of x^p(t) is less than that of x^0(t).

If x^p(t) is a singularly perturbed FSMP, then

i) the limits

$$ \lim_{p \to 0} P^p(t/p^k) = P_k(t), \quad k = 1,2,\ldots,m, $$

are all well defined and determine a finite sequence of (in
general stochastically discontinuous) FSMP's x_k(t), k = 1,...,m, with
transition probability matrices P_k(t) (refer to section 2 of the
next chapter for the detailed definition of stochastically
discontinuous FSMP's);

ii) the limit processes x_k(t) can be aggregated to produce a
hierarchy of simplified, approximate models of x^p(t), each of
which is a FSMP valid at a certain time scale t/p^k, describing
changes in x^p(t) at a distinct level of detail; and

iii) the collection of the aggregated models x_k'(t), k = 1,...,m,
can then be combined to construct an asymptotic approximation of
x^p(t) uniformly valid for t ≥ 0.

The above can be expressed by the following theorem:

Theorem 1: Let x^p(t) be a singularly perturbed stochastically
continuous FSMP taking values in E_0 = {1,2,...,n_0}, with transition
probability matrix P(t) = exp{A(p)t} and infinitesimal generator
A(p) of the form (3.1). Then

i) let A_k and Z_k be respectively the infinitesimal generator and
the ergodic projection at zero of some FSMP x_k(t) taking values
in E_0, for which

$$ \lim_{p \to 0} \sup_{t \in [d,T]} \| P(t/p^k) - Z_k \exp\{A_k t\} \| = 0 $$

for all d > 0, T < ∞, and k = 1,...,m (T can be taken equal to ∞ for
k = m). Furthermore, let Z_k = V_k U_k be the canonical product
decomposition of Z_k (see proposition 2.1 of the next chapter).
Then,

ii)

$$ P(t) = \exp\{A(p)t\} = \exp\{A_{00}t\} + \sum_{k=1}^{m} \left( V_k \exp\{A_k' p^k t\} U_k - Z_k \right) + o(1) $$

uniformly for t ≥ 0, where A_k' = U_k A_k V_k is the generator of a
FSMP x_k'(t) taking values in E_k = {1,...,n_k}, and

$$ \sum_{k=0}^{m} \operatorname{rank} A_k' = \operatorname{rank} A(p). $$

The matrices A_k' and Z_k are evaluated for k = 0,1,2 as
follows:

$$ A_0' = A_0 = A_{00}, $$

$$ A_1' = U_1 A_1 V_1 = U_1 A_{01} V_1, $$

$$ A_2' = U_2 A_2 V_2 = U_2 \left( A_{02} - A_{01} A_0^{\#} A_{01} \right) V_2, $$

where $A_0^{\#} = (A_{00} + Z_1)^{-1} - Z_1$,

and $Z_1 = \lim_{t \to \infty} \exp\{A_0 t\}$, or is the solution of

$$ A_0 Z_1 = 0, \quad \text{with} \quad Z_1 \mathbf{1} = \mathbf{1}, \quad \text{where} \quad \mathbf{1}^T = [1 \; 1 \; \ldots \; 1], $$

$$ Z_2 = \lim_{t \to \infty} Z_1 \exp\{A_1 t\} = V_1 \left( \lim_{t \to \infty} \exp\{A_1' t\} \right) U_1, $$

or can be evaluated from

$$ A_1' Z_1' = 0, \quad \text{with} \quad Z_1' \mathbf{1} = \mathbf{1}, \quad \text{and then} \quad Z_2 = V_1 Z_1' U_1. $$

For the expressions for higher order models refer to
[COD83b]. Also, Delebecque [DEL83] developed a recursive algorithm
to compute the above aggregated models A_k'.

Example 4.5:

To illustrate the above theory, consider the simple FSMP x^p(t),
the state rate transition diagram of which is shown in figure
4.10. Since the transition rate p between states 2 and 3 is much
less than one, the process will spend a random amount of time
switching between states 1 and 2 and eventually it will get
trapped in state 3. It is clear that we can identify phenomena occurring
at two time scales. At the "fast" time scale only transitions
between 1 and 2 occur, and x^0(t) is a good model for that. At the
"slow" time scale (t/p), a sample function of the process in the
limit p → 0 is shown in figure 4.11, which is denoted by x_1(t). It
is clear that this function has an infinite number of
discontinuities on a finite time interval. The distribution of
the random variable describing the length of this interval
converges to the exponential distribution [KEI78]. This process
can then be approximately aggregated (figure 4.12). The rate
transition matrix of the aggregated model is obtained as follows.

Let the rate transition matrix of x^p(t) be expressed as

$$ A(p) = A_{00} + p A_{01}, \quad A_{00} = \begin{bmatrix} -1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \text{and} \quad A_{01} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix}. $$

Then, solving for the steady state probability matrix Z_1 of the
process x^0(t) and using the above expressions,

$$ A_1' = U_1 A_{01} V_1 = \begin{bmatrix} -1/2 & 1/2 \\ 0 & 0 \end{bmatrix}. $$

Also notice that rank A_00 + rank A_1' = rank A(p), and therefore
by theorem 3.1,

$$ \exp\{A(p)t\} = \exp\{A_{00}t\} + V_1 \exp\{A_1' p t\} U_1 - Z_1 + o(1). $$

For small t, exp{A_1' p t} ≈ I and exp{A(p)t} ≈ exp{A_00 t}, and
for large t, exp{A_00 t} ≈ Z_1 and exp{A(p)t} ≈ V_1 exp{A_1' p t} U_1.
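
The two-time-scale approximation of this example can be checked numerically. The sketch below is an illustration under stated assumptions: the matrices Z1, V1 and U1 are written out by hand for the two ergodic classes {1,2} and {3} of the unperturbed chain (uniform distribution on the first class), and the matrix exponential is a small pure-Python helper rather than a library routine.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(a, t, squarings=16, terms=12):
    """exp(a*t) via scaling and squaring with a truncated Taylor series."""
    n = len(a)
    scale = t / 2 ** squarings
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    result, term = identity, identity
    for k in range(1, terms + 1):
        term = [[v * scale / k for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


A00 = [[-1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
A01 = [[0.0, 0.0, 0.0], [0.0, -1.0, 1.0], [0.0, 0.0, 0.0]]
Z1 = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]  # ergodic projection
V1 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]                 # class membership
U1 = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]                   # class distributions
A1 = matmul(matmul(U1, A01), V1)                          # aggregated generator


def max_error(p, t):
    """Largest entrywise gap between exp{A(p)t} and its approximation."""
    a = [[A00[i][j] + p * A01[i][j] for j in range(3)] for i in range(3)]
    exact = expm(a, t)
    fast = expm(A00, t)
    slow = matmul(matmul(V1, expm(A1, p * t)), U1)
    return max(abs(exact[i][j] - (fast[i][j] + slow[i][j] - Z1[i][j]))
               for i in range(3) for j in range(3))


print(A1, max_error(0.01, 50.0))
```

The aggregated generator reproduces the matrix given in the example, and for a small p the gap between the exact and approximate transition matrices stays small at both short and long times.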

To show the application of the above theory to the
hierarchical model described in the previous section, let us consider
example 4.4 of that section, and let us analyze the exact
solution of the QN in figure 4.9 under the specified job
behaviour using the Markov chain model, assuming that the
service time distributions for all three types of tasks
in the network are exponential. The M.C. model of the QN in
figure 4.9 is shown in figure 4.13. The state is characterized by
a 2×3 matrix with rows representing queues and columns
representing the task types.

For example, at state 1 a type 0 task is being serviced at queue
1 (the processor queue), while at state 2 it is being serviced
at the I/O queue. The transition rate from state 1 to state 2 is
equal to (1/S_01)(1 − p_010) = 1 − p, where S_01 = 1 and p_010 = p.

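The limit as p → 0 can be interpreted as the average
number of cycles that a task makes through the processor queue
and the I/O queue tending to infinity. If the tasks are large, i.e.,
they need several cycles of CPU-I/O processing (p << 1) before they
terminate (branch to node 0), the above M.C. can be analyzed
using the theory described above. Let the rate transition matrix
of this M.C. be expressed as

$$ A(p) = A_{00} + p A_{01}, \quad \text{where} \quad A_{00} = \begin{bmatrix} A'' & 0 & 0 & 0 \\ 0 & B'' & 0 & 0 \\ 0 & 0 & C'' & 0 \\ 0 & 0 & 0 & D'' \end{bmatrix}, \quad B'' = \begin{bmatrix} -2 & 1 & 1 & 0 \\ 1 & -2 & 0 & 1 \\ 1 & 0 & -2 & 1 \\ 0 & 1 & 1 & -2 \end{bmatrix}, $$

and

$$ A_{01} = \begin{bmatrix}
0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & -1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}. $$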

It can be easily seen that the M.C. is decomposed into four
ergodic classes at zero (i.e., at p = 0), with transitions
represented in the figure by solid lines, as follows:

E_1 = {1,2}, E_2 = {3,4,5,6}, E_3 = {7,8}, and E_4 = {9,10}.

These classes partition the M.C. into four chains with rate
transition matrices given by A'', B'', C'', and D'' respectively. In
chain 1 the type 0 task is being processed in the network in
either the processor queue or the I/O queue; in chain 2 both type
1 and type 2 tasks are being processed; in chain 3 only the type 2
task is being processed (type 1 has terminated first); and
in chain 4 only the type 1 task is being processed (type 2 has
terminated first). The steady state probability matrix Z_1 can be
evaluated by solving each one of these chains separately for the
steady state probability matrices Z_i', i = 1,2,3,4, which are
corresponds to the GSPN model in figure 4.7. Therefore, the GSPN
models the system at a higher level of detail, defining the
number of the different types of active tasks in the system. The
corresponding M.C. model of the GSPN is the aggregated model
that describes the behaviour of the process at the slow time
scale (the time required to process a whole task). At the
fast time scale (the time required to process a portion of the
task at one of the servers, i.e., during one cycle in the system),
the QN models the behaviour of the process at a lower level of
detail, i.e., the types of servers required and the mean service
time at each server per visit.
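
The aggregation of the ten-state chain into four classes can be sketched numerically. Several items here are assumptions made for the illustration: the slow-part matrix A01 is read off from the transition structure described in the text (states numbered from 0 in the code), the classes are E1 to E4 as listed above, and each class is taken to have a uniform stationary distribution, which holds when its fast generator is symmetric, as B'' is.

```python
A01 = [
    [0, -1, 1, 0, 0, 0, 0, 0, 0, 0],   # type 0 at CPU: ends, spawns types 1 and 2
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, -1, -1, 0, 1, 0, 1, 0],  # both secondaries at CPU
    [0, 0, 0, 0, 0, -1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, -1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, -1, 0, 0],   # last secondary ends: primary resumes
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, -1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
CLASSES = [[0, 1], [2, 3, 4, 5], [6, 7], [8, 9]]  # E1..E4, zero-based states
B = [[-2, 1, 1, 0], [1, -2, 0, 1], [1, 0, -2, 1], [0, 1, 1, -2]]


def aggregate(a01, classes):
    """Aggregated generator A1' = U1 * A01 * V1 for uniform class distributions."""
    n, m = len(a01), len(classes)
    # U1: row c is the (uniform) stationary distribution of class c.
    u1 = [[1.0 / len(cls) if s in cls else 0.0 for s in range(n)] for cls in classes]
    # V1: column c is the indicator vector of class c.
    v1 = [[1.0 if s in cls else 0.0 for cls in classes] for s in range(n)]
    ua = [[sum(u1[c][s] * a01[s][j] for s in range(n)) for j in range(n)]
          for c in range(m)]
    return [[sum(ua[c][s] * v1[s][d] for s in range(n)) for d in range(m)]
            for c in range(m)]


# Column sums of B are zero, so the uniform vector is stationary for class E2.
uniform_is_stationary = all(sum(B[i][j] for i in range(4)) == 0 for j in range(4))
A1 = aggregate(A01, CLASSES)
for row in A1:
    print(row)
```

Multiplying the off-diagonal entries of the result by p gives a rate of p/2 for each task completion, which agrees with the throughputs r0(1) = ri(j) = p/2 used for the GSPN in example 4.4.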

The above example demonstrates that the approximate
hierarchical model proposed in the previous section is based on
the approximate hierarchical aggregation of the exact M.C. model.
The basic assumption is that jobs consist of large tasks,
i.e., tasks that require multiple accesses to many different
system resources before they terminate.

4.4.3  Validation  Examples 

In this section, the accuracy of the above model is validated
by comparison with discrete event simulation. Two examples of
systems with asynchronous tasks and synchronous-asynchronous
concurrent tasks are considered.

In the first example, the system described in example 4.3,
with jobs consisting of asynchronous tasks, is considered. This
system was simulated by a simulation program written in SIMSCRIPT
II.5. The analytical results, obtained from the hierarchical
model as described in example 4.3, were found to be very accurate
compared to the simulation results for a variety of models. Table
4.2 shows the parameter settings for 10 different models (figure
4.4). These parameters were chosen such that the I/O server
(server 2), or the CPU server (server 1), or both are heavily
utilized. This corresponds to cases 1-4, cases 6-9, and cases 5
and 10, respectively. The multiprogramming level N is equal to 5
in all cases. For simplicity, the service time distributions of
both tasks at the I/O server are assumed to be exponential with
the same mean S2 so that the underlying QN will have a product
form solution. The following parameters are common to all cases:

S11 = 0.00001, and S2 = 0.0002


Model number    p010    p110    S01
 1              0.1     0.1     0.0001
 2              0.3     0.1     0.0001
 3              0.5     0.1     0.0001
 4              0.1     0.3     0.0001
 5              0.1     0.5     0.0001
 6              0.1     0.1     0.001
 7              0.1     0.3     0.001
 8              0.1     0.5     0.001
 9              0.3     0.1     0.001
10              0.5     0.1     0.001

Table 4.2 Parameter settings for central server models.
Table 4.3 shows the results obtained for the throughputs of
tasks at the CPU, and the server utilizations. The simulation results
are shown in parentheses, followed by the relative percentage
error between the analytical and simulation results.
             Throughputs at CPU of task type           Utilization
Model no.    Type 0                Type 1               CPU                I/O
 1           2777  (2783), .21     2777  (2780), .1     .31 (.31)          .999 (1.00)
 2           1470  (1475), .33     4417  (4413), .04    .19 (.21), 1.8     1.0  (1.0)
 3           1003  (1007), .29     4998  (5007), .17    .15 (.17), 11      1.0  (1.0)
 4           4379  (4399), .45     1459  (1460), .06    .45 (.44), 2.2     .99  (.99)
 5           4926  (4951), .5      985   (988), .35     .49 (.50), 2       .98  (.99)
 6           989.7 (998.5), .88    989.7 (981.5), .33   1.0 (.99), 1       .36  (.36)
 7           996   (1003), .7      332.2 (332.1), .03   .99 (.99)          .23  (.23)
 8           997   (1002), .5      199.6 (200.1), .2    1.0 (.97), 3       .20  (.20)
 9           969.7 (981.7), 1.2    2909  (2912), .1     1.0 (.99), 1       .66  (.66)
10           935.9 (951.9), 1.6    4679  (4734), 1.1    .98 (.93)          .94  (.94)

Table 4.3 Comparison between analytical and simulation results.
Notice that the analytical results are still very accurate
even when one of the servers is saturated (100% utilization).
This was not the case in the model developed in [HSD 82], which is
mainly suitable for balanced systems.
Figure 4.15 shows the throughput of primary tasks as a
function of the multiprogramming level N from the analytical and
simulation models, and figure 4.16 shows the CPU utilization
as a function of N. The parameters for these figures are as
follows:

S01 = .0001, S11 = .00001, S2 = .0001, p010 = .35, and p110 = .4

Example 4.7: Consider the QN in figure 4.17 with a CPU queue at
node 1 and an I/O queue at node 2. Let us assume again for
simplicity that the CPU queue is a single server queue with
processor sharing queueing discipline and exponentially
distributed service times with mean S_i1 for type i tasks. The I/O
queue is also a single server queue with FCFS queueing
discipline and exponentially distributed service times with a
common mean for all types of tasks, i.e., S_i2 is independent of
i. This QN will have a product form solution for a specific
number of tasks.

Job behaviour is modeled by the GSPN in figure 4.18,
where a primary task may with probability P(2) subdivide into two
synchronous (type 1 and type 2) tasks, or with probability
(1 − P(2)) spawn an asynchronous task (type 3), which executes
concurrently with it and terminates independently. The Markovian
model of this GSPN is complicated to evaluate, but using the
concept of hierarchical decomposition on the GSPN, the above
system can be easily solved. Assume that the initiation of
synchronous tasks is more frequent than that of asynchronous ones
(P(2) > 0.5), and that asynchronous tasks take a longer time to execute
than synchronous ones. Then, while a certain number of
asynchronous tasks are executing in the system, several
synchronous ones will start and terminate over a period long enough to
let the subnetwork that models their activities reach local
equilibrium. The GSPN can then be decomposed into the GSPN (N1) in
figure 4.7 of example 4.4 at the lower level, and the SPN (N2) in
figure 4.5 of example 4.3 at the higher level of the hierarchy.

Figure 4.13

In figure 4.19, let $r_{01}$ and $r_{02}$ be the rates of transition
t0 for N1 and N2 respectively. Then, at each state k of N2, which is
defined by the number of tokens in p7 (i.e., for a certain number
of asynchronous tasks in the system), N1 is to be solved for the
local steady state probability distribution $P_1(M',k)$, $M' \in S_1$,
where $S_1$ is the reachability set of N1 when there are N tokens
initially in p1 (figure 4.19 (a)), and
$r_{01}(M',k) = P(2)\, r_0(M',k)$.
Also the local performance parameters such as throughputs,
utilizations, and mean queue lengths are to be evaluated. Then N2
can be solved as mentioned in example 4.3, with

$r_{02}(k) = (1 - P(2)) \sum_{M' \in S_1} r_0(M',k)\, P_1(M',k)$,

and $r_6(k) = \sum_{M' \in S_1} r_6(M',k)\, P_1(M',k)$,

where $r_0(M',k)$ and $r_6(M',k)$ are the throughputs of type 0 and
type 3 tasks respectively at node 0 of the QN at state $M = (M',k)$.
The global performance parameters can be evaluated from local
ones as mentioned before.
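
The two aggregated rates above are probability-weighted averages of
local rates over the lower-level distribution. A minimal sketch of
that averaging step is given below; the function name
aggregate_rate, the marking labels, and all numerical values are
hypothetical and are not taken from the report.

```python
def aggregate_rate(local_rates, local_probs, weight=1.0):
    """Average a marking-dependent rate over the local steady state
    distribution P1(M', k), optionally scaled by a constant weight."""
    if set(local_rates) != set(local_probs):
        raise ValueError("rates and probabilities must cover the same markings")
    return weight * sum(local_rates[m] * local_probs[m] for m in local_rates)


# Hypothetical data for one state k of the higher-level net N2.
p2 = 0.7                                   # P(2), assumed value
P1 = {"M1": 0.5, "M2": 0.3, "M3": 0.2}     # local distribution P1(M', k)
r0 = {"M1": 2.0, "M2": 1.0, "M3": 0.0}     # r0(M', k), assumed values
r6 = {"M1": 0.0, "M2": 0.5, "M3": 1.0}     # r6(M', k), assumed values

r02_k = aggregate_rate(r0, P1, weight=1.0 - p2)   # rate of t0 seen by N2
r6_k = aggregate_rate(r6, P1)                     # rate of t6 seen by N2
```

In a full implementation this step would be repeated for every
state k of N2 after solving N1 for that k.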

The above was implemented for a set of ten central server
models. Table 4.4 shows the different parameters of the models.
This set contains cases with only moderate utilizations at the
devices as well as heavily CPU-bound and/or I/O-bound cases. The
following parameters are common to all cases:

N  *2,  Png®  1  ~  P i 1 2 =  '  P 210=  1  "  P 212 =  3-9  ' 

&2lQa  1  _  P312*  ®  ^  *  ^01a  0.01  ,  and  3.3303 


Model number    P(2)    p010    S11     S21     S2

     1          0.9     0.5     0.5     0.5     0.1
     2          0.7     0.5     0.5     0.5     0.1
     3          0.5     0.5     0.5     0.5     0.1
     4          0.9     0.1     1       0.5     0.04
     5          0.5     0.1     1       0.5     0.04
     6          0.3     0.1     1       0.5     0.04
     7          0.9     0.1     0.01    0.05    0.008
     8          0.7     0.1     0.01    0.05    0.008
     9          0.5     0.1     0.01    0.05    0.008
    10          0.3     0.1     0.01    0.05    0.008

Table 4.4  Parameter settings for central server models

Notice that the synchronous tasks are CPU bound and the
asynchronous tasks are I/O bound.

Each of the above models was simulated by a simulation
program written in SIMSCRIPT II.5, and each simulation was run
for several minutes on an IBM 3033.

Table 4.5 shows the percent relative errors (100
percent times the absolute value of simulation estimate minus
approximate analytical value, divided by simulation estimate) for
the performance parameters of each model.
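
The error measure defined above can be written as a one-line
function. The sketch below is only an illustration of that
definition; the function name and the sample values are
hypothetical.

```python
def percent_relative_error(simulated, analytical):
    """100 * |simulation estimate - analytical value| / simulation estimate."""
    if simulated == 0:
        raise ValueError("relative error is undefined for a zero simulation estimate")
    return 100.0 * abs(simulated - analytical) / abs(simulated)


# Hypothetical throughput values: simulation 2.0, analytical model 1.9.
err = percent_relative_error(2.0, 1.9)
```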
MODEL   THROUGHPUTS AT CPU OF TASK TYPE    UTILIZATION     MEAN QUEUE LENGTH
NO.       0       1       2       3        CPU     I/O     AT CPU

  1      0.1     0.3     0.0     2.2       0.5     0.0      4
  2      0.6     1.6     1.6     0.5       1.9     0.1      0.5
  3      14      14      15      15        15      16       20
  4      2.2     1.8     1       0.1       2.2     3.8      7.2
  5      4.6     4.3     4.3     6.3       5.2     2        0.1
  6      15      17      17      1?        19      15       10
  7      5.1     5.2     4.9     6.9       4.7     5.3      6.1
  8      3.1     1.4     2.1     2.1       2.7     2.8      2.5
  9      0.0     1.1     0.4     3.8       0.6     1.3      2.5
 10      4.8     6.3     6.3     6.3       5.3     5.6      10

Table 4.5.  Percent relative errors

Large errors were found in models with high I/O
utilization (around 93%), such as models 3 and 6. Best results
were found in the CPU bound models, as expected. The above
decomposition of GSPNs will be investigated further in Chapters 6


CHAPTER 5

ANALYSIS OF THE GENERALIZED STOCHASTIC PETRI NETS
BY STATE AGGREGATION

5.1  Introduction

In this chapter, the analysis of GSPNs will be considered.
The existing methods of analysis will be briefly described with
their advantages and limitations. A different and more general
method of analysis will then be presented.

As mentioned before, there are two types of transitions in
GSPNs: immediate and timed. Once enabled, immediate transitions
fire in zero time, while timed ones fire after an exponentially
distributed random time. Several transitions may be enabled by a
marking. If the set of enabled transitions H comprises only timed
transitions with rates $r_i$ ($i \in H$), then the enabled timed
transition $t_i$ fires with probability

$r_i / \sum_{k \in H} r_k$    (5.1.1)

If H comprises several timed transitions and one immediate
transition, then the immediate transition is the one that fires,
with probability one. If H comprises several immediate
transitions, it is necessary to specify a probability
distribution on the set of enabled immediate transitions
according to which the firing transition is selected. The subset
of H comprising all enabled immediate transitions together with
the associated probability distribution is called a random
switch, and the associated distribution is called the switching
distribution.
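
The firing rule just described can be sketched as a small function.
This is only an illustration under stated assumptions: the function
name, the dictionary-based interface, and the transition labels are
hypothetical, and the switching distribution is taken as given by
the modeller, as the text requires.

```python
def firing_probabilities(timed_rates, switching_distribution=None):
    """Return {transition: probability that it fires next} for one marking.

    timed_rates: rates of the enabled timed transitions.
    switching_distribution: probabilities over the enabled immediate
    transitions (the random switch); None or empty if none is enabled.
    """
    if switching_distribution:
        # Immediate transitions pre-empt all timed ones.
        if abs(sum(switching_distribution.values()) - 1.0) > 1e-9:
            raise ValueError("switching distribution must sum to one")
        return dict(switching_distribution)
    total = sum(timed_rates.values())
    if total <= 0:
        raise ValueError("at least one enabled transition with positive rate is needed")
    # Relation (5.1.1): r_i divided by the sum of the enabled rates.
    return {t: r / total for t, r in timed_rates.items()}


only_timed = firing_probabilities({"t1": 1.0, "t2": 3.0})
with_immediate = firing_probabilities({"t1": 1.0}, {"t3": 1.0})
```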


Assuming that the reachability set S is finite, and that the
firing rates of timed transitions do not depend on the time
parameter (however, they may be marking or state dependent),
Marsan et al. [MAR84] have recognized that the time behaviour of
a GSPN is equivalent to a stationary (homogeneous), finite state,
continuous time stochastic point process (SPP), and that a one to
one correspondence exists between GSPN markings and the SPP
states. The sample functions of the SPP may present "multiple
discontinuities" due to the sequential firing of one or more
immediate transitions. The process is observed to spend a
non-negative amount of time in markings enabling timed
transitions only, while it transits in zero time through markings
enabling immediate transitions. A state (or a marking) of the
former type is called tangible, and a state (or a marking) of the
latter type is called vanishing.

Therefore, the state space of the GSPN is divided into a set
of tangible states and a set of vanishing states. Furthermore,
by assuming that the GSPN is irreducible, i.e., each element of
the set of all possible markings S is reachable with a non-zero
probability from any other state of the set (no marking or
group of markings exists that absorbs the process), they proposed
two solution methods for evaluating the steady state probability
distributions of the GSPN.

The first method, which is a simple extension of the one
proposed by Molloy [MOL81], assumes that all immediate
transitions are replaced by timed transitions characterized by
very high firing rates proportional to an arbitrary value x. Under
this assumption all states are tangible, and the GSPN reduces to
a standard SPN, which can be analyzed by solving the
corresponding M.C. If an explicit solution expression for the
probability distribution of this SPN is obtained, the steady
state probability distribution of the original GSPN can be
obtained by taking the limit of that solution as x goes to
infinity. However, since most practical cases involve GSPNs with
a large state space, an explicit expression of the solution in
terms of x is usually not easy to obtain. The practical
alternative, solving the problem numerically by assuming x to be
very large and setting to zero those probabilities that appear
exceptionally small, runs into numerical problems. Moreover, the
above method not only requires useless computations of the
probabilities of vanishing states, but it also increases the
computational complexity by enlarging the size of the rate
transition matrix.

The second method proposed in [MAR84] eliminates some of the
disadvantages of the above method by computing the total
transition probabilities among tangible states only. The method
is described as follows:

Let S = state space of the SPP, $|S| = k_s$

    T = set of tangible states in the SPP, $|T| = k_t$

    V = set of vanishing states in the SPP, $|V| = k_v$

with $S = T \cup V$, $T \cap V = \emptyset$, and $k_s = k_t + k_v$.

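Disregarding for the time being the concept of time, and
focusing attention on the set of states into which the process is
led because of a transition out of a given state, it is observed
that a stationary embedded Markov chain (EMC) can be recognized
within the SPP. The transition probability matrix of this EMC can
be written as follows:

$$U = A + B =
\begin{bmatrix} C & D \\ 0 & 0 \end{bmatrix} +
\begin{bmatrix} 0 & 0 \\ E & F \end{bmatrix}$$

where the first block row corresponds to the $k_v$ vanishing states
and the second block row to the $k_t$ tangible states.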


The elements of matrix A, which represent the probability
that the process will go to a vanishing state (C) or to a
tangible state (D) given that it is at a vanishing state, can be
obtained using the switching distributions of random switches.
The elements of B, which represent the transition probabilities
given that the process is in a tangible state, can be obtained
using the firing rates of timed transitions as in relation 5.1.1.

The transition probability matrix $Q = [q_{ij}]$ between tangible
states only can be computed as follows:

$q_{ij} = f_{ij} + \sum_{r \in V} e_{ir} \Pr[r \to j]$,  $i, j \in T$, $r \in V$    (5.1.2)

where $f_{ij}$ is the transition probability from tangible state i to
tangible state j, $e_{ir}$ is the transition probability from i to a
vanishing state r, and $\Pr[r \to j]$ represents the probability that
the SPP moves from the state r to the state j in an arbitrary
number of steps following a path through vanishing states only.

The probabilities of reaching tangible states in no more than k
steps through vanishing states, starting from a vanishing state,
are given by

$G^{k} = \sum_{h=0}^{k} C^{h} D$    (5.1.3)

The irreducibility property of the SPP ensures that the
spectral radius of the matrix C is less than one. This implies
that the limit $\lim_{k \to \infty} \sum_{h=0}^{k} C^{h}$ exists and is
finite. Equation (1.2) in matrix form becomes

$Q = F + E\, G^{\infty}$    (5.1.4)

where

$G^{\infty} = \left( \sum_{h=0}^{k_0} C^{h} \right) D$,  where $C^{h} = 0$ for $h > k_0$,

or

$G^{\infty} = (I - C)^{-1} D$,  since this equals $\sum_{h=0}^{\infty} C^{h} D$,

which correspond to cases where there exist no loops among
vanishing states, and cases where such loops exist, respectively.
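
As an illustration of (5.1.4), the sketch below eliminates the
vanishing states of a small hypothetical chain by solving
(I - C) G = D with Gauss-Jordan elimination and then forming
Q = F + E G. The helper names and the numerical blocks C, D, E, F
are assumptions made for this example only; they do not come from
the report.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve(a, b):
    """Solve a x = b (b may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def tangible_transition_matrix(C, D, E, F):
    """Q = F + E (I - C)^{-1} D, the EMC restricted to tangible states."""
    n = len(C)
    i_minus_c = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(n)]
                 for i in range(n)]
    G = solve(i_minus_c, D)            # G plays the role of G-infinity
    EG = matmul(E, G)
    return [[F[i][j] + EG[i][j] for j in range(len(F[0]))]
            for i in range(len(F))]


# Hypothetical chain: two vanishing states with a loop between them,
# and two tangible states.
C = [[0.0, 0.5], [0.5, 0.0]]   # vanishing -> vanishing
D = [[0.5, 0.0], [0.0, 0.5]]   # vanishing -> tangible
E = [[1.0, 0.0], [0.0, 0.0]]   # tangible  -> vanishing
F = [[0.0, 0.0], [1.0, 0.0]]   # tangible  -> tangible
Q = tangible_transition_matrix(C, D, E, F)
```

Each row of the resulting Q should again sum to one, which is a
convenient sanity check on the blocks supplied.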
The solution of the system of linear equations $Y = Y \cdot Q$ can
be interpreted in terms of the number of transitions performed by
the EMC, observing that

$1/y_i$ = E{number of transitions performed by the EMC to return
to state i}.

Selecting state i as a reference state, let

$v_{ij} = y_j / y_i$ = E{number of visits to state j between two
subsequent visits to state i}.

The steady state probability distribution of the SPP can be
obtained by reintroducing the concept of time by means of the
average sojourn time in each state ($ST_i$, $i \in T$) as follows:

Let $H_i$ = set of timed transitions enabled at state i;

then $ST_i = 1 / \sum_{k \in H_i} r_k$ is the average sojourn time
for state i.

The amount of time spent by the SPP to return to state i is
$W_t = \sum_{j \in T} v_{ij}\, ST_j$, where $v_{ij}\, ST_j$ is considered
to be the mean amount of time spent by the SPP in state j during a
cycle. The average fraction of time spent by the SPP in each of
its states is given by

$P_j = v_{ij}\, ST_j / W_t$,  $j \in T$,

which is the steady state probability distribution of the SPP.
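
The conversion from the EMC solution to time-based probabilities
can be sketched as follows. Because $v_{ij} = y_j / y_i$, the
reference state cancels and the result is proportional to
$y_j\, ST_j$. The function name and the numbers are hypothetical;
the vector y is assumed to have been obtained already by solving
Y = Y Q.

```python
def time_stationary(y, exit_rates):
    """Weight the EMC solution y by the mean sojourn times.

    y: solution of Y = Y Q over the tangible states (any positive scaling).
    exit_rates: for each tangible state, the sum of the rates of the
    timed transitions it enables.
    """
    sojourn = [1.0 / r for r in exit_rates]            # ST_j
    weights = [yj * stj for yj, stj in zip(y, sojourn)]
    total = sum(weights)                               # proportional to W_t
    return [w / total for w in weights]


# Hypothetical two-state case: y = (3/4, 1/4), total exit rates 2 and 1.
probs = time_stationary([0.75, 0.25], [2.0, 1.0])
```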

The advantage of the above method over the first method
is that it reduces the impact of the size of the set of vanishing
states on the complexity of the solution from $O(k_s^3)$ in the
first method to $O(k_t^3) + O(k_v^3)$, where $k_s = k_t + k_v$.

Apart from the fact that the SPP must be irreducible, the above
method, however, has a serious limitation. It implicitly assumes
that the steady state probabilities of all markings that enable
immediate transitions are zero. This limitation will be
demonstrated by the following example.

Example 5.1:

Consider the GSPN in figure 5.1, where $t_1$ and $t_2$ are timed
transitions with rates $r_1$ and $r_2$ respectively. The rest of the
transitions are immediate transitions. The reachability set S
with one token in the network is

1    1 0 0 0

2    0 1 0 0

3    0 0 1 0

4    0 0 0 1

Solving for the steady state probability distribution of the
above states using the first method, where we assume that the
firing rate of all immediate transitions is x, the rate
transition matrix of the corresponding M.C. is

$$A = \begin{bmatrix}
-x & x & 0 & 0 \\
x & -(x + r_1) & r_1 & 0 \\
0 & 0 & -x & x \\
r_2 & 0 & x & -(x + r_2)
\end{bmatrix}$$

The S.S. probabilities $P = [P_1\ P_2\ P_3\ P_4]$ are obtained by
solving

$P A = 0$, with $\sum_i P_i = 1$.

Let $d = 2 r_1/x + 2(1 + r_1/r_2)$;
then,

$P_1 = (1 + r_1/x)/d$, $P_2 = 1/d$, $P_3 = (1 + r_2/x)\, r_1/(r_2 d)$,

and $P_4 = r_1/(r_2 d)$.

In the limit as $x \to \infty$, we have

$P_1 = P_2 = \tfrac{1}{2}\, r_2/(r_1 + r_2)$, and $P_3 = P_4 = \tfrac{1}{2}\, r_1/(r_1 + r_2)$.
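
The closed-form solution can be checked numerically. The sketch
below assumes that the rate matrix has the form x A0 + A1, with A0
and A1 as listed later in this section, and it recomputes the
normalizing constant as the sum of the unnormalized terms rather
than from a formula. The function names and the values chosen for
r1, r2 and x are hypothetical.

```python
def rate_matrix(r1, r2, x):
    """Assumed rate matrix x*A0 + A1 of the four-state chain."""
    return [[-x, x, 0.0, 0.0],
            [x, -(x + r1), r1, 0.0],
            [0.0, 0.0, -x, x],
            [r2, 0.0, x, -(x + r2)]]


def closed_form(r1, r2, x):
    """P1..P4 from the text, normalized so that they sum to one."""
    terms = [1.0 + r1 / x, 1.0, (1.0 + r2 / x) * r1 / r2, r1 / r2]
    d = sum(terms)
    return [t / d for t in terms]


def residual(p, a):
    """Largest absolute entry of the row vector p A."""
    n = len(p)
    return max(abs(sum(p[i] * a[i][j] for i in range(n))) for j in range(n))


r1, r2, x = 2.0, 3.0, 1.0e6
p = closed_form(r1, r2, x)
res = residual(p, rate_matrix(r1, r2, x))        # should be close to zero
limit = [0.5 * r2 / (r1 + r2)] * 2 + [0.5 * r1 / (r1 + r2)] * 2
gap = max(abs(pi - li) for pi, li in zip(p, limit))
```

For large x the gap to the limiting values shrinks roughly like
1/x, which matches the limit stated above.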

Using the second solution method, although the GSPN and the
corresponding SPP are irreducible, the process is treated by this
method as if it were reducible to two ergodic classes: states 1
and 2 as the first, and states 3 and 4 as the second class.
Therefore, the method is not applicable in this case, nor in any
GSPN where ergodic instantaneous markings exist.

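To demonstrate the importance of the above class of GSPNs,
consider the matrix A in the above example. If every element in A
is divided by x, then

$A = A(p) = A_0 + p A_1$, where $p = 1/x$,

$$A_0 = \begin{bmatrix}
-1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & -1 & 1 \\
0 & 0 & 1 & -1
\end{bmatrix}, \quad \text{and} \quad
A_1 = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & -r_1 & r_1 & 0 \\
0 & 0 & 0 & 0 \\
r_2 & 0 & 0 & -r_2
\end{bmatrix}$$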

The steady state probability distribution of the GSPN can be
obtained from those of the above M.C. by letting $p \to 0$. Clearly
the above M.C. is singularly perturbed since rank A(p) > rank
A(0). Therefore, a GSPN is equivalent, in the sense of steady
state probability distributions, to a perturbed SPN with rare
transitions modeled by a small parameter p, in the limit when
$p \to 0$. The class of GSPNs equivalent at the limit to
singularly perturbed SPNs has an important role in the
hierarchical aggregation of the latter. The hierarchical
aggregation of SPNs will be described in Chapter 7.

In the next and subsequent sections we will introduce a more
general solution method that will alleviate the computational and
numerical disadvantages of the first method by eliminating the
set of vanishing states, and generalize the second method. The
method that will be discussed is based on characterizing the GSPN
time behaviour by a stochastically discontinuous finite state
Markov process, which is a special SPP. The properties of this
process will be discussed in detail in the following section.

5.2  Stochastically Discontinuous Finite State Markov Process

The stochastically discontinuous, continuous-time, finite
state Markov process is a process {X(t), t ≥ 0} that may undergo
an infinite number of transitions in finite time intervals. Such
processes violate the continuity condition

$\lim_{t \to 0} \Pr\{X(t) = X(0)\} = 1$

They were first analyzed in [DOO42, DFN65], but were
considered pathological from an application viewpoint, and since
then stochastic continuity has been a standard assumption in the
literature. Coderch [COD83b] has recognized that stochastically
discontinuous processes are obtained as limits of Markov
processes with transition rates of different orders of magnitude,
and that the stochastic discontinuity property has a natural and
important interpretation in this context.

Stochastically discontinuous FSMPs (SDMPs) are
characterized as follows:

Let {X(t), t ≥ 0} be a FSMP taking values in a finite state
space E = {e1, e2, ..., en}. This process is completely described by
its transition probability matrix P(t), whose elements are

$P_{ij}(t) = \Pr\{X(t) = j \mid X(0) = i\}$,  $i, j \in E$, $t \geq 0$,

and which satisfies the following conditions:

i) $P(0) = I$,  ii) $P(t) \geq 0$,  iii) $P(t) \cdot \mathbf{1} = \mathbf{1}$,  and

iv) $P(t) P(s) = P(t+s)$,  $t, s > 0$,  where $\mathbf{1}^T = [1\ 1\ \ldots\ 1]$.

It is known that P(t) is continuous for t > 0, and the limit
$\lim_{t \to 0} P(t) = Z$ always exists. If Z is the identity matrix then
the process X(t) is called stochastically continuous; otherwise
it is stochastically discontinuous with the following transition
probability matrix:

Theorem 2.1: If P(t) is the transition probability matrix of a
SDMP then

$P(t) = Z \exp\{A t\}$,  $t > 0$    (5.2.1)

for a pair of matrices Z, A satisfying:

i) $Z \geq 0$, $Z \cdot \mathbf{1} = \mathbf{1}$, $Z^2 = Z$;

ii) $Z \cdot A = A \cdot Z = A$;

iii) $A \cdot \mathbf{1} = 0$;  iv) $A + c Z \geq 0$ for some $c \geq 0$.

Conversely, any matrices A, Z satisfying (i)-(iv)
uniquely determine a FSMP with transition probability matrix
given by (5.2.1).

Proof: [COD83b].

The matrix $Z = \lim_{t \to 0} P(t)$ is referred to as the ergodic
projection at zero, and the matrix $A = \lim_{h \to 0} (P(h) - Z)/h$ is
called the infinitesimal generator of P(t).


The diagonal entries of the matrix Z classify the states of
the process as follows:

Definition: A state i is called instantaneous if $z_{ii} < 1$, and
regular if $z_{ii} = 1$. An instantaneous state j is called evanescent
if $z_{jj} = 0$.

It was proved that:

1) The sojourn time in an instantaneous state is zero with
probability one (w.p.1), and in a regular state i it is
exponentially distributed with rate determined by $a_{ii}$ (the
diagonal entries in A).

2) Even though the duration of stays in a given instantaneous
state is zero w.p.1, there is, in general, a non-zero probability
of finding the process in an instantaneous state at any given
time. However, the probability of finding the process in an
evanescent state at any given time is zero. The evanescent states
can thus be neglected in the sense that there exists a version
of the process X(t) with the same finite dimensional
distributions which does not take values in the set of evanescent
states.

3) Z is the matrix of ergodic probabilities of a Markov chain
and as such it determines a partition of the state space E in
terms of ergodic classes $E_k$, k = 1,...,s, and transient states $E_T$,

i.e.,  $E = \left( \bigcup_{k=1}^{s} E_k \right) \cup E_T$    (5.2.2)

This is referred to as the ergodic partition at zero. Each
ergodic class $E_k$ consists of either one element (a regular
state), or several elements (instantaneous states). The set of
transient states $E_T$ characterizes the evanescent states.

The evolution of a SDMP can be thought of as follows: While
in a regular state it behaves as a stochastically continuous
process. Upon entering a state belonging to an ergodic class at
zero with more than one state, say $E_k$, the process starts
switching instantaneously among the states in $E_k$. The amount of
time spent in $E_k$ is exponentially distributed, and after a random
stay in $E_k$ the process jumps to some state in $E - E_k$. Evanescent
states may be visited during transitions between the ergodic
classes at zero.

The probabilistic properties of a SDMP are derived from its
ergodic projection at zero plus an aggregated version of the
process that is stochastically continuous. This can be
demonstrated as follows:

Proposition 2.1: Let Z be the ergodic projection at zero of a
SDMP; then, by adequate ordering of states,

$$Z = \begin{bmatrix}
Z_{11} & 0 & \cdots & 0 & 0 \\
0 & Z_{22} & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & Z_{ss} & 0 \\
Z_{s+1,1} & Z_{s+1,2} & \cdots & Z_{s+1,s} & 0
\end{bmatrix}$$    (5.2.3)

with $Z_{kk} = \mathbf{1} \cdot w_k^T$, k = 1,...,s, for some vector $w_k > 0$ such
that $w_k^T \cdot \mathbf{1} = 1$; and $Z_{s+1,k} = d_k \cdot w_k^T$, k = 1,...,s, for a
set of vectors $d_k \geq 0$ such that $\sum_{k=1}^{s} d_k = \mathbf{1}$.

Furthermore,  define  the  (nxs)  matrix  V  and  the  (sxn)  matrix  U 
as  follows: 
(5.2.4)

Then,

$V \cdot U = Z$,  $U \cdot V = I$    (5.2.5)

Proof: Follows from the fact that Z is the matrix of ergodic
probabilities of a Markov chain. The vector $w_k$ is the vector of
steady state probabilities of a chain with state space $E_k$ and
steady state transition matrix $Z_{kk}$. The vectors $d_k$ are the
trapping probabilities from transient states to the ergodic
classes [DOO53].

The structure of (5.2.3) makes explicit the ergodic partition
at zero. (2.4) is called the canonical product decomposition of
Z. Also U and V satisfy the following:

$U \cdot \mathbf{1} = \mathbf{1}$,  $V \cdot \mathbf{1} = \mathbf{1}$,  $U \cdot Z = U$,  and  $Z \cdot V = V$.

Theorem 2.2: Let $P(t) = Z \exp\{A t\}$ be the transition probability
matrix of a SDMP X(t) taking values in E = {e1, e2, ..., en} and let s
be the number of ergodic classes at zero. Let $Z = V \cdot U$ be the
canonical product decomposition of Z; then

$P'(t) = U\, P(t)\, V = \exp\{U A V t\}$,  for all $t \geq 0$    (5.2.6)
is the transition probability matrix of a stochastically
continuous FSMP taking values in E' = {e1', e2', ..., es'}, and

$P(t) = V\, P'(t)\, U$,  for all $t > 0$    (5.2.7)

Proof: [COD83b]

Equation (2.6) can be interpreted as performing an
aggregation operation that masks the stochastically discontinuous
nature of P(t). Also, equation (2.7) can be interpreted as
follows:

$\Pr\{X(t) = e_i \mid X(0) = e_j\} = w_i \cdot \Pr\{X'(t) = e_I' \mid X'(0) = e_J'\}$,
$e_j \in E_J$, $e_i \in E_I$,

where $w_i$ is the component of the steady state probability vector
$w_I$ corresponding to $e_i$. That is, the transitions between the
ergodic classes $E_I$ are governed by the aggregated process, while
once in one of the classes $E_I$, the probabilities $w_i$ are
immediately established due to the instantaneous nature of
transitions.

It should be noted here that the above aggregation is exact,
i.e., there is no approximation involved whatsoever, whereas the
aggregation described in the previous chapter was approximate due
to the fact that the transitions were not quite instantaneous.

Corollary 2.1: The rate transition matrix A' of the aggregated
process X'(t), which is the infinitesimal generator of P'(t), is
given by

$A' = U A_1 V$    (5.2.8)

where $A_1$ is the matrix of transition rates of the process X(t)
when all instantaneous transitions have been removed.

Proof: Follows from theorem 2.2 above, and the theory of
singularly perturbed FSMPs presented in the previous chapter.

Example 5.2: Consider the GSPN in figure 5.1 of the previous
section. The graph of the GSPN represents the transition diagram
of a SDMP with state space E = {e1, e2, e3, e4} represented as follows

e1    1 0 0 0
e2    0 1 0 0
e3    0 0 1 0
e4    0 0 0 1

Clearly the ergodic partition at zero is
$E_1$ = {e1, e2}, and $E_2$ = {e3, e4},

and

$$Z = \begin{bmatrix} Z_{11} & 0 \\ 0 & Z_{22} \end{bmatrix}, \quad \text{where} \quad
Z_{11} = Z_{22} = \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix},$$

since $w_1 = w_2 = [1/2\ \ 1/2]^T$;

then

$$V = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad \text{and} \quad
U = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \end{bmatrix}.$$

The rate transition matrix A' is given by $A' = U A_1 V$, where

$$A_1 = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & -r_1 & r_1 & 0 \\
0 & 0 & 0 & 0 \\
r_2 & 0 & 0 & -r_2
\end{bmatrix}, \quad \text{then} \quad
A' = \begin{bmatrix} -\tfrac{1}{2} r_1 & \tfrac{1}{2} r_1 \\ \tfrac{1}{2} r_2 & -\tfrac{1}{2} r_2 \end{bmatrix}.$$

Solving the aggregated process for the steady state probabilities
of the ergodic classes, we have

$P(E_1) = r_2 / (r_1 + r_2)$,  $P(E_2) = r_1 / (r_1 + r_2)$,

and the steady state probabilities of the SDMP are evaluated by

$[P(e_1)\ \ P(e_2)]^T = w_1 \cdot P(E_1)$,  and  $[P(e_3)\ \ P(e_4)]^T = w_2 \cdot P(E_2)$;

therefore,

$P(e_1) = P(e_2) = \tfrac{1}{2}\, r_2/(r_1 + r_2)$,  and  $P(e_3) = P(e_4) = \tfrac{1}{2}\, r_1/(r_1 + r_2)$,

which is the same result obtained before using the first method
in section 1.
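
The aggregation in this example can be reproduced numerically. The
sketch below assumes the matrices U, V and the rate matrix with
instantaneous transitions removed have the values implied by the
example (two ergodic classes, each with equal weights 1/2), and
picks hypothetical rates r1 = 2 and r2 = 3. It forms the aggregated
generator, solves the resulting two-state chain, and redistributes
the class probabilities over the original states.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


r1, r2 = 2.0, 3.0   # hypothetical rates of the timed transitions t1, t2

V = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
U = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
A1 = [[0.0, 0.0, 0.0, 0.0],
      [0.0, -r1, r1, 0.0],
      [0.0, 0.0, 0.0, 0.0],
      [r2, 0.0, 0.0, -r2]]

UV = matmul(U, V)                     # should be the 2x2 identity
Z = matmul(V, U)                      # ergodic projection at zero
A_agg = matmul(matmul(U, A1), V)      # aggregated generator U A1 V

# Stationary distribution of the two-state aggregated chain.
a12, a21 = A_agg[0][1], A_agg[1][0]
p_classes = [a21 / (a12 + a21), a12 / (a12 + a21)]

# Disaggregate: each class probability is spread with weights w_k,
# which is the row vector p_classes multiplied by U.
p_states = matmul([p_classes], U)[0]
```

With these rates the class probabilities come out as 3/5 and 2/5,
in agreement with r2/(r1+r2) and r1/(r1+r2).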


5.3  Evaluation of the GSPN steady state probability distribution

In this section a solution method for the steady state
probability distribution of the GSPN will be presented. This
method is based on characterizing the time behaviour of a GSPN as
a SDMP.

Theorem 3.1: The marking sequence of a live and k-bounded GSPN
forms a SDMP.

Proof: Let S = {M1, M2, ..., Mn} be the reachability set of a live
and bounded GSPN with an initial marking M1. Since the GSPN is
live, there exists no marking in S at which all transitions
are disabled.

Let {X(t), t ≥ 0} be a stochastic process with a finite state
space E = {1, 2, ..., n}, such that

1- There exists a one to one mapping F between the elements
of E and the elements of S, i.e., for each $M_i \in S$,
there exists a corresponding state $i \in E$, such that $F(M_i) = i$,

2- For each $M_i, M_j \in S$, where Mj is reachable from Mi by the
firing of a transition enabled by Mi, there exists a transition
in X(t) from the corresponding state i to j, and

3- For all $i \in E$, the sojourn time of i is equal to that
of Mi, i.e.,

$\Pr[X(w) = i,\ w \in [0,t] \mid X(0) = i] = \exp\{-y_i t\}$

where $y_i = \sum_{j=1}^{k} r_j$, and $r_j$ is the rate of transition $t_j$
enabled at Mi, $k \geq 1$. If any one of these transitions is an
immediate transition, the above probability will be zero, since
the rate of such a transition is infinite. Therefore, the sojourn
time of state i is either zero if Mi enables any immediate
transition, or exponentially distributed if Mi enables only
timed transitions.

The state space S of the above process can be partitioned
into a set of instantaneous states, and a set of regular states
with exponentially distributed sojourn time. Clearly, if all
states are regular, then X(t) is a finite state stochastically
continuous Markov process. From the theory described in the
previous section, the existence of instantaneous states results
in a Markov process whose transition probability matrix
P(t) is discontinuous at t = 0, i.e., from theorem 2.1

$P(t) = Z \exp\{A t\}$, $t > 0$,  $P(0) = I$,  and  $Z = \lim_{t \to 0} P(t)$.

If all states are regular, then for each state i, $z_{ii} = 1$, and
$Z = I$. If there exists an instantaneous state i, then $z_{ii} < 1$, and
the process is stochastically discontinuous.


The above theorem establishes the fact that the steady state
probability distribution of the GSPN markings can be obtained by
solving the corresponding SDMP. As in the previous section, we
need to obtain the matrices U, V, and A', which fully
characterize the time behaviour of the process. These matrices
are obtained as follows:

From the reachability graph analysis [MAT 30, FLO 34,35, CHI
'51] of the GSPN under immediate transitions only, the
reachability set S can be partitioned into two subsets $S_1$ and $S_2$.
The subset $S_2$ contains markings which enable any immediate
transition, and markings reachable by the firing of an immediate
transition. The subset $S_1$ contains all other markings.
Furthermore, $S_2$ is partitioned into several subsets $S_{2i}$,
i = 1,...,g, and a subset $S_{2T}$. This partition corresponds to the
state space partition into ergodic classes at zero expressed in
equation (2.2) as follows:

$E = S = S_1 \cup S_2$    (5.3.1)

where $S_1 = \bigcup_{i=1}^{k} e_i'$, $S_2 = \left( \bigcup_{i=1}^{g} S_{2i} \right) \cup S_{2T}$, and $k + g = s$,
which is the total number of ergodic classes at zero. Each $S_{2i}$
consists of one ergodic class, which may contain one marking that
absorbs the process under immediate transitions, or several
markings reachable from one another by immediate transitions.
Each $e_i'$ consists of one marking, and the set $S_{2T}$ consists of
transient markings.

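The reason for the above modification of the partition of
ergodic classes at zero is to allow a more straightforward
construction of the matrices U and V, which can then be
partitioned as follows:

$$V = \begin{bmatrix} I_k & 0 \\ 0 & K' \end{bmatrix}, \quad \text{and} \quad
U = \begin{bmatrix} I_k & 0 \\ 0 & K'' \end{bmatrix}$$    (5.3.2)

with the first block row and column corresponding to $S_1$ and the
second to $S_2$, where $I_k$ is an identity matrix of dimension k, K' is
an (n-k)xg matrix, and K'' is a gx(n-k) matrix given by

$$K' = \begin{bmatrix}
\mathbf{1} & 0 & \cdots & 0 \\
0 & \mathbf{1} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \mathbf{1} \\
d_1 & d_2 & \cdots & d_g
\end{bmatrix}, \quad
K'' = \begin{bmatrix}
w_1^T & 0 & \cdots & 0 & 0 \\
0 & w_2^T & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & w_g^T & 0
\end{bmatrix}$$

where the block rows of K' and the block columns of K'' are ordered
as $S_{21}, S_{22}, \ldots, S_{2g}, S_{2T}$,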

where  d^  is  a  vector  of  trapping  probabilities  from  transient 
markings  to  the  ergodie  class  in  S2i.  And  w^  as  before  is  the 
steady  state  probability  vector  of  a  markov  chain  with  state 
space  S2i. 

The vectors of trapping probabilities can easily be obtained as follows: consider an absorbing MC with a state space defined by the union of the set $S_{2T}$ and a set of g absorbing states, each of which corresponds to an ergodic class $S_{2i}$, $i = 1,..,g$. The transition probability matrix $P_T$ of this MC (absorbing states first, then the states of $S_{2T}$) is given by

$$P_T = \begin{bmatrix} I_g & 0 \\ Y & X \end{bmatrix}$$

where $I_g$ is an identity matrix of dimension g, $y_{ij}$ is the transition probability to any state in ergodic class j from a state i in $S_{2T}$, and $x_{ij}$ is the transition probability to state j in $S_{2T}$ from state i in $S_{2T}$. The vectors of trapping probabilities are then computed as follows [XSM 63],

$$[\, d_1 \;\; d_2 \;\; \cdots \;\; d_g \,] = (I - X)^{-1}\, Y \tag{5.3.3}$$
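The computation in (5.3.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `solve_linear` and `trapping_probabilities` are ours, and the three-state transient chain used in the example, together with its branching probabilities 3/10 and 7/10, is hypothetical and not taken from the report. Exact rational arithmetic is used so that the result can be checked by hand.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination; b may have several columns.

    Assumes a is square and nonsingular (true for I - X of an absorbing chain).
    """
    n = len(a)
    aug = [[Fraction(v) for v in a[i]] + [Fraction(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def trapping_probabilities(x, y):
    """Return [d1 ... dg] = (I - X)^{-1} Y, one row per transient state."""
    n = len(x)
    i_minus_x = [[(1 if i == j else 0) - Fraction(x[i][j]) for j in range(n)]
                 for i in range(n)]
    return solve_linear(i_minus_x, y)

# Hypothetical chain: transient states a, b, c and two absorbing classes.
# a -> class 1 (prob 1); b -> c (prob 1); c -> a (prob 3/10) or class 2 (prob 7/10).
p, q = Fraction(3, 10), Fraction(7, 10)
X = [[0, 0, 0], [0, 0, 1], [p, 0, 0]]
Y = [[1, 0], [0, 0], [0, q]]
D = trapping_probabilities(X, Y)
```

Each row of `D` holds the trapping probabilities of one transient state, so each row sums to one whenever every transient state is eventually absorbed.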

To obtain the aggregated matrix $A'$, consider now the GSPN under timed transitions only. The matrix $A_1$, which is the matrix of transition rates between markings that enable timed transitions, is also partitioned as follows:

$$A_1 = \begin{bmatrix} A'' & B'' \\ C'' & D'' \end{bmatrix}$$

where $A''$ is a kxk matrix with off-diagonal elements representing transition rates between markings that belong to the set $S_1$, and $B''$ is a kx(n-k) matrix with elements representing the transition rates from markings that belong to $S_1$ to markings that belong to $S_2$, such that each diagonal element of $A''$ is the negative of the sum of the off-diagonal elements of the same row of $A''$ plus the elements in the corresponding row of $B''$. Also $D''$ is an (n-k)x(n-k) matrix with off-diagonal elements representing transition rates between markings that belong to $S_2$, and $C''$ is an (n-k)xk matrix of transition rates from markings in $S_2$ back to markings in $S_1$, such that each diagonal element of $D''$ is the negative of the sum of the off-diagonal elements of $D''$ and the elements of $C''$ in the same row.


Using equation (2.8), the sxs matrix $A'$ is obtained as follows:

$$A' = U A_1 V = \begin{bmatrix} A'' & B''K' \\ K''C'' & K''D''K' \end{bmatrix} \tag{5.3.4}$$

which can then be solved for the steady state probabilities of the ergodic classes at zero, $P(e_i')$, $i = 1,...,k$, and $P(S_{2i})$, $i = 1,...,g$. The steady state probability distribution of the GSPN markings can then be evaluated from

$$P(M_{si}) = P(e_i'), \quad i = 1,...,k,$$

where $M_{si}$ are markings that belong to $S_1$, and

$$[\, P(M^i_1) \;\; \cdots \;\; P(M^i_j) \,] = w_i \, P(S_{2i})$$

for the j markings that belong to $S_{2i}$, $i = 1,...,g$.
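As a hedged illustration of this last step, the sketch below solves $\pi A' = 0$ with $\sum \pi = 1$ for a small aggregated generator and then spreads the probability of one ergodic class over its markings with the weights $w_i$. The function names and the 4x4 generator (with rates 1, 2 and 3 chosen arbitrarily) are assumptions made for the example, not values from the report.

```python
from fractions import Fraction

def steady_state(generator):
    """Solve pi * generator = 0 with sum(pi) = 1 by Gauss-Jordan elimination.

    Assumes an irreducible generator, so the solution is unique.
    """
    n = len(generator)
    # Row i is the balance equation sum_j pi_j * generator[j][i] = 0.
    rows = [[Fraction(generator[j][i]) for j in range(n)] + [Fraction(0)]
            for i in range(n)]
    # The balance equations are linearly dependent: replace the last one
    # by the normalisation condition sum(pi) = 1.
    rows[-1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [v / pv for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [vr - f * vc for vr, vc in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def disaggregate(class_probability, w):
    """Probabilities of the markings of one ergodic class: w_i * P(S_2i)."""
    return [class_probability * Fraction(x) for x in w]

# Hypothetical aggregated generator: three single-marking classes and one
# class S_21 (last row/column); the rates are illustrative only.
A_prime = [[-5, 2, 3, 0],
           [0, -3, 0, 3],
           [0, 0, -2, 2],
           [1, 0, 0, -1]]
pi = steady_state(A_prime)
markings_of_class = disaggregate(pi[3], [Fraction(1, 2), Fraction(1, 2)])
```

Markings in $S_{2T}$ are transient and receive probability zero, so they do not appear in the disaggregation step.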


5.4  Examples 


1) Consider the simple GSPN shown in figure 4.7. With one token initially in p1, the reachability set is

M1   1 0 0 0 0
M2   0 1 1 0 0
M3   0 0 1 1 0
M4   0 1 0 0 1
M5   0 0 0 1 1

Under the immediate transition, we have M5 --> M1, therefore
$S_1$ = {M2,M3,M4} = {e1,e2,e3}, $S_{21}$ = {M1}, and
$S_{2T}$ = {M5}.


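Then, with the markings ordered M2, M3, M4, M1, M5,

$$V = \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \\ 0&0&0&1 \end{bmatrix}, \quad U = \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \end{bmatrix}, \quad K' = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad K'' = [\, 1 \;\; 0 \,]$$

also

$$A'' = \begin{bmatrix} -(r_1+r_2) & r_1 & r_2 \\ 0 & -r_2 & 0 \\ 0 & 0 & -r_1 \end{bmatrix}, \quad B'' = \begin{bmatrix} 0 & 0 \\ 0 & r_2 \\ 0 & r_1 \end{bmatrix},$$

and

$$D'' = \begin{bmatrix} -r_0 & 0 \\ 0 & 0 \end{bmatrix}, \quad C'' = \begin{bmatrix} r_0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$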

and using (5.3.4) the aggregated transition matrix is given by

$$A' = \begin{bmatrix} -(r_1+r_2) & r_1 & r_2 & 0 \\ 0 & -r_2 & 0 & r_2 \\ 0 & 0 & -r_1 & r_1 \\ r_0 & 0 & 0 & -r_0 \end{bmatrix}$$
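The block structure of the aggregated matrix can be checked numerically. The sketch below assembles $A'$ from the blocks $A''$, $B''$, $C''$, $D''$, $K'$ and $K''$ as we read them in example 1, using the hypothetical numeric rates r0 = 1, r1 = 2, r2 = 3; the helper names `matmul` and `aggregate` are ours.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def aggregate(a2, b2, c2, d2, k1, k2):
    """Assemble A' = [[A'', B''K'], [K''C'', K''D''K']]."""
    top = [ra + rb for ra, rb in zip(a2, matmul(b2, k1))]
    bottom = [rc + rd for rc, rd in
              zip(matmul(k2, c2), matmul(matmul(k2, d2), k1))]
    return top + bottom

r0, r1, r2 = 1, 2, 3          # hypothetical rates, for illustration only
A2 = [[-(r1 + r2), r1, r2], [0, -r2, 0], [0, 0, -r1]]   # rows/columns M2, M3, M4
B2 = [[0, 0], [0, r2], [0, r1]]                         # columns M1, M5
C2 = [[r0, 0, 0], [0, 0, 0]]                            # rows M1, M5
D2 = [[-r0, 0], [0, 0]]
K1 = [[1], [1]]               # K': M1 is its own class, M5 is trapped in it
K2 = [[1, 0]]                 # K'': all steady state weight on M1

A_prime = aggregate(A2, B2, C2, D2, K1, K2)
```

Every row of the result sums to zero, as expected of a rate transition matrix.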


2) Consider the GSPN in figure 5.3. t5 and t6 are immediate
transitions that form a random switch, and fire with probability
P(5) and P(6), respectively. The reachability set S is
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Ml 
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0 
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M2 

0 
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0 
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M3 
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M4 
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0 
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M5 
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0 

M6 
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0 
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1 

0 

M7 
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0 

0 

0 

0 

0 

1 

kv 


Considering immediate transitions only we have

M6 --> M7 --> M1
M7 --> M2 --> M3

We can clearly distinguish two ergodic states M1 and M3.
Therefore,

$S_{21}$ = {M3}, $S_{22}$ = {M1}, and $S_{2T}$ = {M2,M6,M7}

where M6, M7, and M2 are evanescent states. To obtain the trapping probability vectors from these states to $S_{21}$ and $S_{22}$, we have,


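with the states ordered M3, M1, M2, M6, M7,

$$P_T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & P(6) & P(5) & 0 & 0 \end{bmatrix}$$

and from equation (5.3.3), with rows M2, M6, M7,

$$[\, d_1 \;\; d_2 \,] = \begin{bmatrix} 1 & 0 \\ P(5) & P(6) \\ P(5) & P(6) \end{bmatrix}$$

so that, with rows (columns) of $K'$ ($K''$) ordered M3, M1, M2, M6, M7,

$$K' = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ P(5) & P(6) \\ P(5) & P(6) \end{bmatrix}, \quad \text{and} \quad K'' = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix}$$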

Now considering timed transitions, we have

M4 
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M5 
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then  3"K 


K"D  '*  K  1 


P (5) r2 
P  (5)  rl 
(rl+r2) 
r  0 


P (6) r2 
P  (6)  rl 

-E0 


K”CM 


rl 

0 


r2 
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,  therefore,  we  have 
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rl 


3 

■r  1 
r  2 
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P(5) r2 
P (5) rl 
- (rl+r2) 
r3 


?(6)r2 
P (6) rl 
3 

-r3 


,  and

3) Consider the GSPN in figure 5.4. Transitions t2 and t3 can be simultaneously enabled. Therefore, the probabilities P(2) and P(3) are assigned to determine which one will fire first. The reachability set is given by

M1   1 0 0 0 0 0 0 0 0
M2   0 1 1 0 0 0 0 0 0
M3   0 0 1 1 0 0 0 0 0
M4   0 1 0 0 1 0 0 0 0
M5   0 0 0 1 1 0 0 0 0
M6   0 0 0 0 0 1 1 0 0
M7   0 0 0 0 0 0 1 1 0
M8   0 0 0 0 0 1 0 0 1
M9   0 0 0 0 0 0 0 1 1


Again considering immediate transitions only, we have

M9 --> M1 --> M2 --> {M3, M4} --> M5 --> M1

Therefore, there is one ergodic class given by $S_{21}$ = {M1,M2,M3,M4,M5}, where M9 is a transient state. Solving the above Markov chain, which has a transition probability matrix P (rows and columns ordered M1, .., M5) given by

$$P = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & P(2) & P(3) & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}$$

for the steady state probabilities we get,

$$w_1 = [\, 1/4 \;\; 1/4 \;\; P(2)/4 \;\; P(3)/4 \;\; 1/4 \,]$$

Then, $K'^{T} = [\, 1 \;\; 1 \;\; 1 \;\; 1 \;\; 1 \;\; 1 \,]$ and $K'' = [\, w_1 \;\; 0 \,]$.
Considering timed transitions, we get, with the markings of $S_1$ ordered M6, M7, M8 and those of $S_2$ ordered M1, M2, M3, M4, M5, M9,

$$A'' = \begin{bmatrix} -(r_6+r_7) & r_6 & r_7 \\ 0 & -r_7 & 0 \\ 0 & 0 & -r_6 \end{bmatrix}, \quad B'' = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & r_7 \\ 0 & 0 & 0 & 0 & 0 & r_6 \end{bmatrix},$$

$$C'' = \begin{bmatrix} r_5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad D'' = \begin{bmatrix} -r_5 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

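Then,

$$B''K' = \begin{bmatrix} 0 \\ r_7 \\ r_6 \end{bmatrix}, \quad K''C'' = [\, r_5/4 \;\; 0 \;\; 0 \,], \quad \text{and} \quad K''D''K' = [\, -r_5/4 \,]$$

Therefore,

$$A' = \begin{bmatrix} -(r_6+r_7) & r_6 & r_7 & 0 \\ 0 & -r_7 & 0 & r_7 \\ 0 & 0 & -r_6 & r_6 \\ r_5/4 & 0 & 0 & -r_5/4 \end{bmatrix}$$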
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CHAPTER  6 


TECHNIQUES  FOR  REDUCING  ANALYSIS  COMPLEXITY 


6.1  Introduction

The method described in the previous chapter is valid for the analysis of general GSPNs. However, the analysis can be quite complicated for GSPNs with large state spaces. For example, consider the GSPN in example 1 of section 5.4 in the previous chapter: with one token initially in p1, there were 5 feasible states for the network. If, however, we added k tokens in p1, the number of states will be in the order of $5^k$. Therefore even with such a simple GSPN, an explosion of the state space can make the analysis very complicated.

In this chapter, rather than describing the stochastic behaviour of a GSPN by a SDMP, which is then analyzed from its projection at zero plus an aggregated version of it represented by the rate transition matrix $A'$, we attempt to do such aggregation or reduction directly at the GSPN level.

The analysis in this chapter will be restricted to a class of GSPNs that inherits the structure of restricted PNs. Such PNs will be defined in the following section, and some of their important properties will be developed.


6.2  Restricted  Petri  Nets 


Definition 1: A Petri Net PN = (P,T,I,O), with an initial marking $M_1$ and a reachability set S, is called a restricted PN if all arcs have a weight of one, i.e., the input and output functions are such that $I: P \times T \to \{0,1\}$ and $O: T \times P \to \{0,1\}$, and for any transition in T the set of input places and the set of output places are disjoint (self-loop-free).

We will also assume that the PN is live and bounded, i.e., for every $t_i \in T$ and for all $M_k \in S$ there exists a transition firing sequence starting at $M_k$ and ending at a marking that enables $t_i$. An important property of restricted PNs is established by the following theorem.

Theorem 1: (superposition theorem)

For any restricted PN, let $S_1, S_2, .., S_k$ be reachability sets obtained from the different initial markings $M_{11}, M_{12}, .., M_{1k}$, respectively. Then for an initial marking $M_1' = M_{11} + M_{12} + ... + M_{1k}$, which gives a reachability set $S'$, if

$$M_r' = M^1_{j1} + M^2_{j2} + \cdots + M^k_{jk} \tag{6.2.1}$$

where $M^i_{ji} \in S_i$, $i = 1,..,k$, then $M_r' \in S'$, i.e., $S' \supseteq (S_1 + S_2 + \cdots + S_k)$.

Moreover, the above condition becomes necessary and sufficient if for every initial marking $M_{1i}$, $i = 1,...,k$, the PN is live.

To prove the above theorem, we need to introduce the following definitions.


Definition 2: For a restricted PN = (P,T,I,O), where $P = \{p_1, p_2, .., p_n\}$, $T = \{t_1, t_2, .., t_m\}$, $I: P \times T \to \{0,1\}$, and $O: T \times P \to \{0,1\}$, with an initial marking $M_1$ and reachability set S, then for any $M_k, M_{k-1} \in S$ such that $M_k$ is immediately reachable from $M_{k-1}$ by the firing of transition $t_j$,

$$M_k = M_{k-1} + D^T U_j \tag{6.2.2}$$

where $M_k$ and $M_{k-1}$ are nx1 column vectors, $U_j$ is an mx1 column vector with exactly one nonzero entry in the position corresponding to transition $t_j$, and D is an mxn matrix called the transition to place incidence matrix, defined as follows:

$$D = -(D^-)^T + D^+, \quad \text{where } D^- = \{d^-_{ij}\} = \{I(p_i,t_j)\} \text{ and } D^+ = \{d^+_{ij}\} = \{O(t_i,p_j)\}.$$

Therefore, the entries $d_{ij}$ of matrix D are 1, -1, or 0 if transition $t_i$ has an outgoing arc to place $p_j$, an incoming arc from place $p_j$, or no arc between them, respectively. Equation 6.2.2 is the matrix form of equation 2.2.1 in Chapter 2 [MUR77, PET81]. For example, for M1 and M2 in example 1 of section 5.4, with the rows of $D^T$ ordered p1, .., p5 and its columns ordered t1, .., t4,

$$\begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 & 0 & 0 & 1 \\ 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
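The state equation (6.2.2) translates directly into code. The following minimal sketch assumes the incidence matrix of the worked example, with rows read as t1, .., t4 and columns as p1, .., p5; the function name `fire` is ours. Because a restricted PN has unit arc weights and no self-loops, a transition is enabled exactly when adding its row of D leaves every place non-negative.

```python
# Transition-to-place incidence matrix D (rows t1..t4, columns p1..p5),
# as read from the worked example; the labelling is our assumption.
D = [
    [-1, 1, 1, 0, 0],    # t1: p1 -> p2 + p3
    [0, -1, 0, 1, 0],    # t2: p2 -> p4
    [0, 0, -1, 0, 1],    # t3: p3 -> p5
    [1, 0, 0, -1, -1],   # t4: p4 + p5 -> p1
]

def fire(marking, d, j):
    """Apply M_k = M_{k-1} + D^T U_j for the unit firing vector U_j.

    D^T U_j is simply row j of D. Returns None when t_j is not enabled,
    i.e. when some input place would go negative.
    """
    new_marking = [m + delta for m, delta in zip(marking, d[j])]
    return new_marking if min(new_marking) >= 0 else None

M1 = [1, 0, 0, 0, 0]
M2 = fire(M1, D, 0)        # fire the first transition
blocked = fire(M1, D, 3)   # the join transition is not enabled in M1
```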

The above can be extended to a sequence of transition firings as follows:

Definition 3: For any PN with an initial marking $M_1$, incidence matrix D, and reachability set S, if $M_k \in S$, then

$$M_k = M_1 + D^T U_{1,k} \tag{6.2.3}$$

where $U_{1,k}$ is an mx1 column vector called the firing vector, the ith element of which is the number of times transition $t_i$ fires in the transition firing sequence $(t_{j1}, t_{j2}, .., t_{jk})$ starting from $M_1$ and ending at $M_k$, i.e.,

$$U_{1,k} = \sum_{i=1}^{k} U_{ji}$$

Proof of theorem 1:

Let $M_r' = M^1_{j1} + \cdots + M^k_{jk}$, for any $M^i_{ji} \in S_i$, $i = 1,..,k$.

By 6.2.3, $M^i_{ji} = M_{1i} + D^T U_{1,ji}$, $i = 1,..,k$.

Then

$$M_r' = \sum_{i=1}^{k} M_{1i} + D^T U = M_1' + D^T U, \quad \text{where } U = \sum_{i=1}^{k} U_{1,ji}.$$

Since $M_1'$ is the initial marking for the reachability set $S'$, and all the transition sequences in U are defined in $M_1'$, then again by 6.2.3, $M_r'$ is reachable from $M_1'$, i.e. $M_r' \in S'$.

For the second part of the theorem, the above proves sufficiency. The proof of necessity is done by contradiction as follows:

Let $M_r' \in S'$, and $M_r' \neq M^1_{j1} + \cdots + M^k_{jk}$ for any $M^i_{ji} \in S_i$. Then $M_r' = M_1' + D^T U = M_{11} + M_{12} + \cdots + M_{1k} + D^T U$, and since $M_r' \neq \sum_i M^i_{ji}$, there exist no $U_{1,ji}$ such that $U = \sum_i U_{1,ji}$, where each $U_{1,ji}$ is defined by a sequence of transition firings from $M_{1i}$. Then

$$M_r' = \sum_{i=1}^{k} M^i_{mi} + D^T U', \quad \text{where } M^i_{mi} \in S_i,$$

and $U'$ is a firing vector of transitions not enabled by any $M^i_{mi}$, $i = 1,...,k$. Since the firing vector U is defined for $M_1'$, there exist at least two markings, from the above set of markings, that can be added together to enable a transition in $U'$. Therefore, there exists at least one initial marking from the set $\{M_{11}, M_{12}, ..., M_{1k}\}$ for which the PN is not live, which is a contradiction. #

An important consequence of the above theorem is that many important characteristics of a live restricted PN, with an initial marking $M_1'$ and reachability set $S'$, can be studied by dividing $M_1'$ into several initial markings $M_{1i}$ with reduced reachability sets $S_i$, $i = 1,...,k$. And if, under each one of these initial markings, the PN is live, then the reachability set $S'$ can be constructed by adding all possible combinations of markings in the reduced sets $S_i$.

Corollary 1: For a live restricted PN with an initial marking $M_1$ and a reachability set S, if an initial marking $M_1' = k\,M_1$, for some integer k, is considered, a reachability set $S'$ is obtained such that, for any $M_i' \in S'$,

$$M_i' = M_{j1} + M_{j2} + \cdots + M_{jk}, \quad \text{where } M_{jl},\ l = 1,..,k, \text{ are in } S.$$
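The corollary can be checked by brute force on a small net. The sketch below is an illustration under assumptions: it uses a five-place fork-join net with the incidence matrix shown in the code (our reading of example 1 of section 5.4), enumerates the reachability set by breadth-first search, and compares the set obtained from two tokens in p1 with the set of all pairwise sums of single-token markings. The function names are ours.

```python
from collections import deque
from itertools import combinations_with_replacement

# Assumed incidence matrix (rows t1..t4, columns p1..p5) of a fork-join net.
D = [
    [-1, 1, 1, 0, 0],
    [0, -1, 0, 1, 0],
    [0, 0, -1, 0, 1],
    [1, 0, 0, -1, -1],
]

def reachability_set(initial, d):
    """Enumerate all markings reachable from `initial` (bounded nets only)."""
    seen = {tuple(initial)}
    queue = deque(seen)
    while queue:
        marking = queue.popleft()
        for row in d:
            nxt = tuple(m + delta for m, delta in zip(marking, row))
            # Unit arc weights and no self-loops: enabled iff nothing goes negative.
            if min(nxt) >= 0 and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def k_fold_sums(markings, k):
    """All sums M_j1 + ... + M_jk of k (not necessarily distinct) markings."""
    return {tuple(map(sum, zip(*combo)))
            for combo in combinations_with_replacement(sorted(markings), k)}

S = reachability_set((1, 0, 0, 0, 0), D)         # one token in p1
S_prime = reachability_set((2, 0, 0, 0, 0), D)   # initial marking 2 * M1
superposition_holds = S_prime == k_fold_sums(S, 2)
```

For this net the single-token set has five markings and the two-token set has fourteen, since two different pairs of markings can add up to the same marking.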

The above corollary can be used to analyze the behaviour of restricted PNs where there exists a place $p_1 \in P$, called the exciting place, such that the initial marking of the PN is defined by one or more tokens in $p_1$ and zero tokens in all other places. Such PNs are particularly suitable for modeling job behaviour as described in the previous chapter, where the number of tokens initially in $p_1$ represents the number of jobs that are being processed in the system (the multiprogramming level).

6.3  Reduction And Aggregation of GSPNs

In this section, reduction and aggregation of GSPNs will be considered. By reduction we mean the elimination of immediate transitions that cause the existence of transient instantaneous markings (the steady state probabilities of which will always be zero). And by aggregation we mean the aggregation of subnetworks consisting of immediate transitions that cause the existence of ergodic classes of instantaneous markings at time t = 0.


Figure 6.1 shows four examples of the reduction process. These examples involve subnetworks containing single input-single output, single input-multiple output, multiple input-single output, and multiple input-multiple output immediate transitions, respectively. This process is done locally in each subnetwork without affecting the rest of the network, which is a very important property. The multiple input immediate transitions, such as the ones shown in figure 6.1 (c) and (d), cannot be eliminated if they are in conflict (i.e. share a common input place) with any other immediate transition. As was shown in Chapter 4, by using such conflicting multiple input immediate transitions, we are able to model queuing systems with multiple classes of customers and fixed priority queuing disciplines, which cannot be modeled by SPNs. This is of course due to the fact that SPNs form a subclass of GSPNs. The above reduction process will be investigated further towards the end of this section.

The aggregation process can be carried out on a class of subnetworks defined by the following definitions.

Definition 4: For a GSPN = (P,T,I,O), with an initial marking $M_1$ and a reachability set S, a subnetwork $N = (P_1, T_1, I_1, O_1)$ is defined such that $T_1 \subseteq T$ is a set of immediate transitions, and $P_1 \subseteq P$ is the set of input and output places of the transitions in $T_1$, i.e., for any $p_i \in P$, $p_i \in P_1$ if and only if $I(p_i,t_j) = 1$ or $O(t_j,p_i) = 1$ for some $t_j \in T_1$.

Also $I_1$ and $O_1$ are the input and output functions I and O restricted to $P_1$ and $T_1$, i.e.,

$$I_1: P_1 \times T_1 \to \{0,1\} \text{ such that } I_1(p_i,t_j) = \begin{cases} I(p_i,t_j) & \text{if } (p_i,t_j) \in P_1 \times T_1 \\ 0 & \text{if not} \end{cases}$$

$$O_1: T_1 \times P_1 \to \{0,1\} \text{ such that } O_1(t_j,p_i) = \begin{cases} O(t_j,p_i) & \text{if } (t_j,p_i) \in T_1 \times P_1 \\ 0 & \text{if not} \end{cases}$$

Definition 5: For the subnetwork N defined above, the set of places $P_{in} \subseteq P_1$ and the set of transitions $T_{in} \subseteq \{T - T_1\}$ are defined such that for any $p_i \in P_{in}$, $O(t_j,p_i) = 1$ for some $t_j \in T_{in}$. Also the set of places $P_{out} \subseteq P_1$ and the set of transitions $T_{out} \subseteq \{T - T_1\}$ are defined such that, for any $p_i \in P_1$ with $I(p_i,t_j) = 1$ for some $t_j \in \{T - T_1\}$, then $p_i \in P_{out}$ and $t_j \in T_{out}$.

Transitions in $T_{in}$ deposit tokens into places in the set $P_{in}$ of subnetwork N. And transitions in $T_{out}$ remove tokens from places in the set $P_{out}$ of N.

The subnetwork N defined above partitions the reachability set S into two subsets defined as follows.

Definition 6: The subnetwork N partitions the set S into two subsets $S_1$ and $S_2$, such that for all $M_i \in S_1$ and all $p_i \in P_1$, $M_i(p_i) = 0$, and for all $M_j \in S_2$, $M_j(p_i) > 0$ for at least one $p_i \in P_1$. Moreover the set $S_2$ can also be partitioned into several subsets as follows,

$$S_2 = \bigcup_{i=1}^{l} S_{2i},$$

such that for any $M_k, M_j \in S_{2i}$, $M_k(p_n) = M_j(p_n)$ for all $p_n \in \{P - P_1\}$. Therefore, if $M_j$ is reachable from $M_k$, then all transitions in the transition sequence starting at $M_k$ and ending at $M_j$ belong to $T_1$.

Definition 7: The subnetwork N defined above is said to be recurrent if for each $S_{2i}$, $i = 1,..,l$, and for any $M_k, M_j \in S_{2i}$, $M_k$ is reachable from $M_j$ by a finite sequence of transitions in $T_1$.

Definition 8: The sets $S_{2i}$, $i = 1,..,l$, belong to equivalent classes denoted by the sets $S_{e1}, S_{e2}, .., S_{em}$, where each $S_{ei} = \{S_{2i1}, S_{2i2}, ..., S_{2ir}\}$, such that the sets $S_{2ij}$, $j = 1,..,r$, contain exactly the same number of markings, and for any $S_{2ij}, S_{2in} \in S_{ei}$, there exist a marking $M_i \in S_{2ij}$ and a marking $M_j \in S_{2in}$ such that for all $p_k \in P_1$, $M_i(p_k) = M_j(p_k)$.

Each of the equivalent classes defined above is obtained from a different initial marking in the subnetwork N. Therefore, if we define a marking function $M^N$, which is the marking M restricted to the set of places $P_1$ of subnetwork N, then the sets in each one of the classes $S_{ei}$, $i = 1,..,m$, become indistinguishable, and therefore each of the above classes reduces to a set of markings defined on $P_1$ and obtained from an initial marking $M^N_{1i}$, $i = 1,..,m$. These initial markings are introduced into the subnetwork by the firing of one or more transitions in $T_{in}$ (the set of input transitions of N), which modify the markings of places in $P_{in}$ (the set of input places of N).

For a recurrent subnetwork, and for each initial marking $M^N_{1i}$, the marking sequence in the subnetwork is isomorphic to an ergodic discrete parameter Markov chain, and the steady state probability distribution of the number of tokens in each place can be obtained. However, in order to analyze a subnetwork in isolation from the rest of the network, the following locality condition must be satisfied.

Definition 9 (locality): For a subnetwork N, if the probability of firing a transition in N is dependent only on the local markings of N, then N is said to satisfy the locality condition.

Definition 10 (conservation): For a subnetwork N, and for any initial marking $M^N_{1i}$ of N, if the total number of tokens in $M^N_{1i}$ is equal to the total number of tokens in any marking that enables an output transition in $T_{out}$, N is said to satisfy the conservation condition.

The above definition merely states that a conservative subnetwork is one which does not create (or eliminate) tokens to (from) the rest of the network.

The aggregation of a subnetwork with immediate transitions is given in the following theorem.


Theorem 2: For a live and bounded restricted GSPN B = (P,T,I,O) with an initial marking $M_1$ and a set of transition firing rates R (defined for timed transitions), if there exists a subnetwork $N = (P_1, T_1, I_1, O_1)$ as defined in definition 4, such that

i) the set of input places $P_{in}$ contains only one element, and the set of output transitions $T_{out}$ consists of timed transitions,

ii) the subnetwork is recurrent and satisfies the locality and conservation conditions,

then an aggregated network $B' = (P', T', I', O')$ with a set of transition rates $R'$ is obtained by substituting the subnetwork N by one place $p_a$, such that

$$P' = \{P - P_1\} \cup \{p_a\}, \quad T' = \{T - T_1\},$$

$$I'(p_i,t_j) = I(p_i,t_j), \quad O'(t_j,p_i) = O(t_j,p_i) \quad \forall\, p_i \in P - P_1 \text{ and } t_j \in T - T_1, \text{ and}$$

$$I'(p_a,t_i) = I(p_j,t_i), \quad O'(t_i,p_a) = O(t_i,p_j) \quad \forall\, p_j \in P_1 \text{ and } t_i \in T - T_1.$$

And if $r_i$ is the rate of an output transition $t_i \in T_{out}$ of the subnetwork, i.e., $I(p_k,t_i) = 1$ for some $p_k \in P_1$, then the rate of such a transition in the aggregated network becomes marking dependent and is given by

$$r_i'(j) = r_i \, P_i(j),$$

where j is the number of tokens in $p_a$ at the current marking, and

$$P_i(j) = \Pr[\text{finding at least one token in place } p_k \text{ of the subnetwork, given } j].$$

These probabilities are obtained by solving the subnetwork N in isolation for the steady state probabilities for each possible value of j which defines the initial marking for N. The rates of transitions in $\{T - T_1 - T_{out}\}$ remain unchanged, i.e., for any $t_i \in \{T - T_1 - T_{out}\}$, $r_i' = r_i$.
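The marking-dependent rate can be sketched as follows. The code assumes that the subnetwork has already been solved in isolation, so that for each token count j a steady state distribution over the subnetwork markings is available. The function names, the two-place subnetwork and all numeric values are hypothetical and serve only to illustrate the formula $r_i'(j) = r_i P_i(j)$.

```python
from fractions import Fraction

def prob_place_nonempty(distribution, place):
    """P_i(j): probability of at least one token in `place`.

    `distribution` maps each subnetwork marking (a tuple of token counts)
    to its steady state probability for a fixed number of tokens j.
    """
    return sum(p for marking, p in distribution.items() if marking[place] > 0)

def aggregated_rates(r_i, distributions, place):
    """Return {j: r_i * P_i(j)} for every token count j that was solved."""
    return {j: r_i * prob_place_nonempty(dist, place)
            for j, dist in distributions.items()}

# Hypothetical steady state distributions of a two-place subnetwork,
# indexed by the number of tokens j held in the aggregated place pa.
distributions = {
    1: {(1, 0): Fraction(3, 4), (0, 1): Fraction(1, 4)},
    2: {(2, 0): Fraction(9, 16), (1, 1): Fraction(6, 16), (0, 2): Fraction(1, 16)},
}
# Output transition with rate 8 whose input place is the second place.
rates = aggregated_rates(Fraction(8), distributions, place=1)
```

In this example the rate grows with j because a larger population makes it more likely that the input place of the output transition is occupied.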

Proof: We prove the above theorem, using the theory described in the previous chapter, by showing that the above aggregation is actually a state aggregation of the process that describes the stochastic behaviour of the GSPN.

Assume for simplicity that there exist no other immediate transitions in the GSPN (other than the ones in N). Then the partition defined in definition 6 of the reachability set S into the subsets $S_1$ and $S_2$ is precisely the partition of S defined in equation (5.3.1) into ergodic classes at zero of the SDMP that describes the behaviour of the GSPN, where $S_1$ contains k markings, and $S_2$ is further partitioned into l ergodic classes $S_{2i}$, $i = 1,...,l$ (since N is a recurrent subnetwork). The matrices $K'$ and $K''$ of equation (5.3.2) are

$$K' = \begin{bmatrix} \mathbf{1} & & & \\ & \mathbf{1} & & \\ & & \ddots & \\ & & & \mathbf{1} \end{bmatrix}, \quad \text{and} \quad K'' = \begin{bmatrix} w_1 & & & \\ & w_2 & & \\ & & \ddots & \\ & & & w_l \end{bmatrix}$$

with the block rows of $K'$ and the block columns of $K''$ ordered $S_{21}, S_{22}, .., S_{2l}$, where $w_i$ as before is the steady state probability vector of a Markov chain with state space $S_{2i}$. The trapping probability vectors do not appear here since there are no evanescent states. However, the subsets $S_{2ij}$, $j = 1,...,r$, that belong to the same equivalent class $S_{ei}$ as defined in definition 8, will have the same probability vector $w_i$, which is obtained by solving the subnetwork N in isolation with an initial marking of i tokens in its input place.

The rate transition matrix $A'$ of the aggregated process, given in equation (5.3.4), is now shown to be the same as the rate transition matrix of the aggregated network B'. Figure 6.2 shows a transition diagram between the various subsets of S, where rini is the transition firing rate of a transition in $T_{in}$ enabled by a marking in $S_1$, r1 is the rate of a transition in $\{T - T_1 - T_{in} - T_{out}\}$ enabled by all markings in $S_{21}$, rin is the rate of a transition in $T_{in}$ enabled by all markings in $S_{21}$, and rout is the rate of a transition in $T_{out}$ enabled by a marking in $S_{21}$. The matrices $A''$, $B''$, $C''$, and $D''$ entering equation (5.3.4) are,


A**  * 


-r  ini 


B"  * 


rini 


C"  «  coat 


(rl+rin)  j 

-(rl+rin) 

-  (r l*r in+rout) 


S2r+1 


Then, the kx(s-k) matrix $B''K'$ (where s = k+l) will have the same nonzero elements as $B''$. And the matrices $K''C''$ and $K''D''K'$ are given by

$$K''C'' = \begin{bmatrix} \cdots & r_{out}\, w_1^i & \cdots \end{bmatrix}, \quad K''D''K' = \begin{bmatrix} -(r_{out}\, w_1^i + r_1 + r_{in}) & \cdots \end{bmatrix}$$

(shown for the row of $S_{21}$), where $w_1^i$ is the probability of marking i in $S_{21}$ that enables rout.

Clearly the only elements affected by the aggregation are the rates of the output transitions (transitions in $T_{out}$). In the general case, however, when there is more than one marking in an ergodic class that enables an output transition, the transition rate is multiplied by the sum of the probabilities of such markings, which can be expressed as the probability of finding at least one token in the input place of such an output transition. #


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The  above  theorem  defines  an  aggregation  operation  on  a  GSPN. 
The  importance  of  this  theorem  for  the  approximate  aggregation  of 
SPNs  will  be  demonstrated  in  the  next  Chapter. 

The reduction operation described in the beginning of this
section can be generalized for a class of reducible subnetworks
defined as follows.

Definition 11: A subnetwork N, defined in def. 4, is said to be reducible if, for each $p_i \in P_{out}$, there exists no transition $t_j \in T_1$ such that $I_1(p_i,t_j) = 1$. Therefore, there exist markings in each $S_{2i}$, $i = 1,..,l$, that only enable transitions in $T_{out}$.

The  reduction  operation  of  a  reducible  subnetwork  is  given  in 
the  following  proposition. 

Proposition 1: For a live and bounded restricted GSPN B = (P,T,I,O), with an initial marking $M_1$, reachability set S, and a set of transition firing rates R, if there exists a reducible subnetwork $N = (P_1, T_1, I_1, O_1)$ as defined above, such that

i) the set of input-output transitions $T_{in} \cup T_{out}$ consists of timed transitions, and for each $t_i \in T_{in}$ ($t_i \in T_{out}$), there exists only one place $p_j \in P_{in}$ ($p_j \in P_{out}$) such that $O(t_i,p_j) = 1$ ($I(p_j,t_i) = 1$), and

ii) N satisfies the locality and conservation conditions,

then a reduced network $B' = (P', T', I', O')$ is obtained by replacing N, except for its places in $P_{out}$, by a set of timed transitions $T_a$ such that

$$P' = \{P - P_1\} \cup P_{out}, \quad T' = \{T - T_1\} \cup T_a,$$

for all $t_j \in \{T - T_1\}$,

$$I'(p_i,t_j) = I(p_i,t_j), \quad O'(t_j,p_i) = O(t_j,p_i) \quad \forall\, p_i \in \{P - P_1\},$$

$$I'(p_i,t_j) = I(p_i,t_j) \quad \forall\, p_i \in P_{out},$$

and for each $t_i \in T_{in}$, and for $P_{out} = \{p_{o1}, p_{o2}, .., p_{ox}\}$, define new transitions $t_{i1}, t_{i2}, .., t_{i(x-1)} \in T_a$ such that

$$O'(t_i,p_{o1}) = 1, \quad O'(t_{i1},p_{o2}) = 1, \quad O'(t_{i2},p_{o3}) = 1, \; ..., \text{ and } O'(t_{i(x-1)},p_{ox}) = 1.$$

Also

$$O'(t_{is},p_e) = O(t_i,p_e), \quad I'(p_e,t_{is}) = I(p_e,t_i) \quad \forall\, p_e \in \{P - P_1\}, \quad s = 1,...,(x-1).$$

(Each $t_i \in T_{in}$ is connected to the first place in $P_{out}$, and for each $t_i$, (x-1) transitions $t_{is}$, $s = 1,...,(x-1)$, are defined in $T_a$ that have the same input and output places in $\{P - P_1\}$ as $t_i$. Each $t_{is}$ also has $p_{o(s+1)} \in P_{out}$ as an output place. $T_a$ is the set of all transitions $t_{is}$, $s = 1,...,(x-1)$, defined for each $t_i$.)

Also the set $R'$ is defined as follows: for each timed transition $t_k \in \{T - T_1 - T_{in}\}$, $r_k' = r_k$, and for each $t_i \in T_{in}$ and its corresponding $t_{is}$, $s = 1,...,x-1$, in $T_a$,

$$r_i' = r_i \, P_i(1), \quad r_{is}' = r_i \, P_i(s+1) \quad \text{for } s = 1,...,x-1,$$

where $P_i(j)$, $j = 1,...,x$, are the trapping probabilities of a token in the output place of $t_i$ in the set $P_{in}$ to each one of the places in $P_{out}$, respectively.
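The rate assignment of the proposition amounts to splitting the rate of each input transition according to the trapping probabilities. A minimal sketch, with a hypothetical function name and hypothetical numbers, is given below; it additionally assumes that a token entering the subnetwork always ends up in one of the output places, so that the trapping probabilities sum to one.

```python
from fractions import Fraction

def reduced_rates(r_i, trapping):
    """Split the rate r_i of an input transition t_i over the output places.

    `trapping` is [P_i(1), ..., P_i(x)]. Returns (r'_i, [r'_i1, ..., r'_i(x-1)]),
    i.e. the rate kept by t_i (feeding po1) and the rates of the new
    transitions t_is (feeding po(s+1)).
    """
    # Assumption: every token is eventually trapped in some place of Pout.
    if sum(trapping) != 1:
        raise ValueError("trapping probabilities must sum to one")
    rates = [r_i * p for p in trapping]
    return rates[0], rates[1:]

# Hypothetical input transition with rate 6 and three output places.
kept, new = reduced_rates(Fraction(6), [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)])
```

The total rate out of the marking that enables the input transition is unchanged by the split, which matches the fact that the reduction only removes evanescent states.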

Proof :  we  prove  the  above  proposition,  using  again  the  theory 
described  in  the  previous  chapter,  by  showing  that  the  above 
reduction  operation  corresponds  to  neglecting  evanescent  states  in 
the  process  that  describe  the  behaviour  of  the  G5PM. 

In figure 6.3(a), let ti be a transition in Tin with rate ri.
Place pj is a place in Pin such that O(ti,pj) = 1. The set of
places {po1,po2,...,pox} is the set of output places Pout, and
roi, i = 1,...,x, are the rates of the output transitions in Tout.
Assume for simplicity that there exist no other immediate
transitions in the GSPN (other than the ones in N). Let Mi1 be a
marking that enables ti, let SN = {Mj1,Mj2,...,Mjk} be the set of
markings that enable transitions in T1, and let SO =
{Mo1,Mo2,...,Mox} be the set of markings with at least one token
in any one of the places in Pout. Consider, for simplicity,
markings with only one token in any one of the places of figure
6.3(a), and let Mj1 be the marking in SN with one token in pj.
Then the blocks VN and UN of matrices V and U that correspond to
the above sets of markings are given by,


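$$V_N = \begin{array}{c|ccccc}
 & M_{i1} & M_{o1} & M_{o2} & \cdots & M_{ox} \\ \hline
M_{i1} & & & & & \\
M_{o1} & & & I & & \\
\vdots & & & & & \\
M_{ox} & & & & & \\
M_{j1} & 0 & P_i(1) & P_i(2) & \cdots & P_i(x) \\
M_{j2} & 0 & P_2(1) & P_2(2) & \cdots & P_2(x) \\
\vdots & & & & & \\
M_{jk} & 0 & P_k(1) & P_k(2) & \cdots & P_k(x)
\end{array}$$

$$U_N = \begin{array}{c|cccc|cccc}
 & M_{i1} & M_{o1} & \cdots & M_{ox} & M_{j1} & M_{j2} & \cdots & M_{jk} \\ \hline
M_{i1} & & & & & & & & \\
M_{o1} & & I & & & & 0 & & \\
\vdots & & & & & & & & \\
M_{ox} & & & & & & & &
\end{array}$$

where $P_i(s)$ is the trapping probability from Mj1, and $P_m(s)$ is
the trapping probability from Mjm, m = 2,...,k, to Mos, s =
1,...,x. Also the corresponding block A1N of A1 is,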


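The corresponding block A'N of A' is, therefore, given by,

$$A'_N = \begin{array}{c|ccccc}
 & M_{i1} & M_{o1} & M_{o2} & \cdots & M_{ox} \\ \hline
M_{i1} & -r_i & r_i P_i(1) & r_i P_i(2) & \cdots & r_i P_i(x) \\
M_{o1} & & -r_{o1} & & & \\
M_{o2} & & & -r_{o2} & & \\
\vdots & & & & \ddots & \\
M_{ox} & & & & & -r_{ox}
\end{array}$$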

Clearly the rate ri of the input transition is multiplied
by the trapping probabilities from Mj1 only to markings that
belong to SO. Figure 6.3(b) shows the reduction operation which
produces the same matrix A'N. #
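The reduction in the proof above can be checked numerically. The following is a minimal sketch, not the report's own procedure: it builds the blocks $U_N = [I \mid 0]$ and $V_N$ as described, and forms $U_N A_{1N} V_N$. The block $A_{1N}$ is an assumption here (the input transition moves Mi1 to Mj1 at rate ri, each output marking Mos is left at rate ros, and the rows of the vanishing markings are zero). The function name `reduced_block` and the numbers in the usage note are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def reduced_block(ri, ro, P):
    """Return A'_N = U_N * A1_N * V_N.

    ri : rate of the input transition ti
    ro : list of the x rates of the output transitions
    P  : k rows of x trapping probabilities; row 0 is taken from Mj1
    """
    x, k = len(ro), len(P)
    n = 1 + x  # tangible markings: Mi1, Mo1, ..., Mox
    # U_N = [I | 0], size n x (n + k)
    U = [[1.0 if c == r else 0.0 for c in range(n + k)] for r in range(n)]
    # V_N = [I ; 0 P], size (n + k) x n
    V = [[1.0 if c == r else 0.0 for c in range(n)] for r in range(n)]
    V += [[0.0] + list(P[m]) for m in range(k)]
    # Assumed A1_N: Mi1 -> Mj1 at rate ri, Mos left at rate ros,
    # rows of the vanishing markings Mj1..Mjk are zero.
    A1 = [[0.0] * (n + k) for _ in range(n + k)]
    A1[0][0] = -ri
    A1[0][n] = ri
    for s in range(x):
        A1[1 + s][1 + s] = -ro[s]
    return matmul(matmul(U, A1), V)
```

For example, `reduced_block(2.0, [1.0, 3.0], [[0.25, 0.75], [0.5, 0.5]])` has first row `[-ri, ri*P_i(1), ri*P_i(2)]`, i.e. the rate of the input transition split by the trapping probabilities from Mj1 only, as stated in the proof.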

Example: Consider the GSPN in figure 6.4(a), in which the
subnetwork of immediate transitions consists of reducible and
recurrent parts. Using theorem 1, the recurrent part can be
aggregated first to places pa1 and pa2, and the rates of
transitions t6 and t7 are multiplied by the appropriate
probabilities. Then, using proposition 1, a reduction operation
can be done on the remaining reducible subnetwork as shown in
figure 6.4(b), where Pj(k) is the trapping probability from a
token in pij to pak, j,k = 1,2.


CHAPTER  7 


APPROXIMATE AGGREGATION OF SPNs

7.1  Overview 

In this chapter the analysis of SPNs by approximate
aggregation and lumping is considered. In section 7.2, the
approximate hierarchical aggregation of SPNs is demonstrated by
several examples. In section 7.3, the approximate lumping of
parallel transitions in a SPN is considered.

7.2  Hierarchical  Aggregation  of  SPNs 

The analysis of SPNs with transition rates of different
orders of magnitude can be greatly simplified using approximate
aggregation techniques. In this section, the analysis of such
SPNs is considered. We demonstrate by several examples that the
analysis of singularly perturbed SPNs can be reduced to the
analysis of a hierarchical sequence of subnetworks, each of which
is valid at a certain time scale. Since the time behaviour of a
SPN is isomorphic to a continuous time MC, the hierarchical
aggregation of SPNs is equivalent to that of MCs described in
chapter 4. However, as was the case for queuing networks, such
aggregation at the SPN level is much more advantageous than the
aggregation of the corresponding MC when the state space is very
large. This is because at the SPN level we are dealing with the
aggregation of subnetworks, whereas at the MC level we aggregate
groups of large numbers of states.

The exact aggregation, defined in the previous chapter for
subnetworks consisting of immediate transitions in a GSPN, can be
employed for subnetworks consisting of fast transitions in an
SPN. However, the aggregation is then approximate, since fast
transitions have very large, yet finite, firing rates compared to
slow transitions. The following example illustrates the above
concept.

Example 7.1: Consider the SPN shown in figure 7.1, where
r1, r2, r3, and r4 are large compared to r5 and r6. Considering
only the fast transitions together with their input-output places,
the SPN is decomposed into the recurrent subnetworks N1 and N2
shown in figure 7.2(a). Using the theory developed in the previous
chapter, these subnetworks can be aggregated, and an aggregated
SPN with slow transitions can be obtained as shown in figure
7.2(b), where r'5(i) and r'6(j) are state dependent rates given by

$$r'_5(i) = r_5\, P_1(i), \qquad r'_6(j) = r_6\, P_2(j),$$

where

P1(i) = Pr[finding at least one token in place p2 of N1 when
there are i tokens in place pa1], and

P2(j) = Pr[finding at least one token in place p4 of N2 when
there are j tokens in pa2].

The above probabilities are obtained by solving subnetworks
N1 and N2 for all possible markings of the aggregated SPN, then

$$P_1(1) = \frac{r_1}{r_1 + r_2}, \qquad P_2(1) = \frac{r_3}{r_3 + r_4},$$

$$P_1(2) = \frac{r_1 r_2}{r_1^2 + r_1 r_2 + r_2^2}, \qquad P_2(2) = \frac{r_3 r_4}{r_3^2 + r_3 r_4 + r_4^2}.$$


The rate transition matrix A' of the aggregated SPN is given

[matrix illegible in the source; the legible entries are r'5(1) and r'5(2)]

... probabilities of the markings of the aggregated SPN.

[Figure 7.2]

To show that the above is equivalent to the approximate
aggregation of the isomorphic MC of the SPN in figure 7.1, the
reachability set of this SPN and the corresponding rate transition
matrix are given by,

[reachability set and matrix A illegible in the source; the legible entries are the diagonal terms -x_i and the rates r1, r2, r4, r5, and r6]

where x_i is the sum of all the elements in row i.
Since r1, r2, r3, and r4 >> r5 and r6, the above matrix can
be written as,

$$A = A(p) = A_0 + p\,A_1,$$

where p = r5 + r6 is the maximum degree of coupling between
aggregates [COUR 77], and A0 = A(0) is obtained by letting r5 and
r6 equal 0 in A. Then from theorem 1 of chapter 4, and since
rank A(p) > rank A0 (i.e., the process is singularly perturbed),
the steady state transition probability matrix Pr of the MC is
given by,

$$P_r = \lim_{t\to\infty} \exp\{A\,t\} = \lim_{t\to\infty} V \exp\{A''\,p\,t\}\, U,$$

where $Z = \lim_{t\to\infty} \exp\{A_0\,t\} = V\,U$ is the canonical
decomposition of Z, and $p\,A'' = U\,p\,A_1\,V$ is the rate transition
matrix of the aggregated process. It can be easily shown that this
matrix is the rate transition matrix A' of the aggregated SPN of
figure 7.2(b).
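The equivalence between aggregating the SPN and aggregating its MC can be illustrated on a chain much smaller than the one of figure 7.1. The sketch below uses a hypothetical three-state chain that is not taken from the report: states a and b are linked by fast rates r1 (a to b) and r2 (b to a), state b leaks to a slow state c at rate r5, and c returns to a at rate r6. The aggregated model scales r5 by the probability r1/(r1+r2) of finding the token in b, in the same way as r'5 = r5 P1(1) in example 7.1. All names are illustrative.

```python
def exact_stationary(r1, r2, r5, r6):
    """Stationary probabilities (a, b, c) of the full three-state chain."""
    # Balance equations, written relative to pi_b = 1:
    #   pi_a * r1 = pi_b * (r2 + r5)   and   pi_c * r6 = pi_b * r5
    wa = (r2 + r5) / r1
    wb = 1.0
    wc = r5 / r6
    total = wa + wb + wc
    return wa / total, wb / total, wc / total


def aggregated_stationary(r1, r2, r5, r6):
    """Stationary probabilities (aggregate {a, b}, c) of the reduced chain."""
    p_b = r1 / (r1 + r2)   # fast subnetwork solved in isolation
    r5_agg = r5 * p_b      # state dependent slow rate
    p_c = r5_agg / (r5_agg + r6)
    return 1.0 - p_c, p_c


if __name__ == "__main__":
    full = exact_stationary(100.0, 200.0, 1.0, 2.0)
    reduced = aggregated_stationary(100.0, 200.0, 1.0, 2.0)
    print("P(c) exact      =", full[2])
    print("P(c) aggregated =", reduced[1])
```

The two values of P(c) approach each other as the ratio of slow to fast rates decreases, which is the sense in which the aggregation is approximate for fast, rather than immediate, transitions.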

The  approximate  aggregation  of  the  SPN  of  figure  7.1  produces 
a  hierarchical  decomposition  of  the  SPN  at  two  time  scales.  At 
the  fast  time  scale  t  the  SPN  reduces  to  subnetworks  Nl  and  N2  of 
figure  7.2(a).  And  at  the  slow  time  scale  t/p,  the  aggregated  SPN 
of  figure  7.2(b)  is  obtained.  The  current  marking  of  the 
aggregated  SPN  at  the  higher  level  of  the  hierarchy  determines 
the  number  of  tokens  in  the  subnetworks  at  the  lower  level,  which 
are  solved  to  determine  the  rates  of  transitions  at  the  higher 
level.  Such  hierarchical  decomposition  is  symptomatic  of 
singularly  perturbed  SPNs  defined  as  follows, 

Definition 1: A SPN with fast and slow transitions is said to be
singularly perturbed if and only if the corresponding MC is
singularly perturbed, i.e., if rank A > rank A0, where A is the
rate transition matrix and A0 is the matrix A when the rates of
all slow transitions are set to zero.

In the rest of this chapter the analysis will be focused on
irreducible SPNs, defined as follows,

Definition 2: A SPN = (P,T,I,O), with an initial marking M1 and a
reachability set S, is said to be irreducible if it is live and
recurrent, i.e., if for any ti in T and for all Mj in S, there
exists a transition firing sequence from Mj in which ti fires,
and if for any Mj, Mk in S, Mk is reachable from Mj.

The recurrence of markings and the liveness issues are
related; however, they are not equivalent. It is clear that a
recurrent PN is not necessarily live, and not all live and
bounded PNs are recurrent. Molloy [MOL 81] proved that any live
and bounded PN, with a reachability set S, has a unique subset of
recurrent markings S', contained in S, which entirely describes
the steady state behaviour of the SPN. In the above definition of
an irreducible SPN, S' = S.

The  following  proposition  characterizes  singularly  perturbed 
SPNs. 

Proposition 7.1: An irreducible SPN with fast and slow transitions,
and a reachability set S, is singularly perturbed if and only if
one of the following conditions is satisfied,

i) There exists more than one marking in S which enables only
slow transitions,

ii) There exist at least two disjoint recurrent subnetworks
consisting of fast transitions, or

iii) There exist at least one marking as defined in i), and
one subnetwork as defined in ii).
Proof: From definition 1, a SPN is singularly perturbed if and
only if, in its corresponding MC, rank A > rank A0, or
equivalently nul.A < nul.A0. Since for an irreducible SPN
nul.A = 1, then

a) if condition i) is satisfied, then clearly nul.A0 > 1 (there
exists more than one row of zeros in A0).

b) if condition ii) is satisfied, then each subnetwork will
produce at least one diagonal block in A0, the nullity of
which will be equal to 1, and therefore nul.A0 > 1.

c) clearly from the above, if condition iii) is satisfied, then
again nul.A0 > 1.

We prove the necessity of the above conditions by
contradiction as follows. Suppose that none of the above
conditions is satisfied, yet nul.A0 > 1. Then, considering only
fast transitions, there exists more than one ergodic class of
markings Si, i = 1,2,... Let E be the set of all ergodic classes.
If any Si in E contains more than one marking, then these markings
must be reachable from each other by fast transitions
(recurrent). But since condition ii) is not satisfied, there
exists at most one Si in E with more than one marking, and each of
the remaining classes consists of a single marking. Now, since
conditions ii) and iii) are not satisfied, these single
markings do not enable any slow transition, and no other
transition in the SPN is enabled. Then nul.A > 1, i.e. the SPN
is not live, which is a contradiction. #

As mentioned before, the hierarchical decomposition of
perturbed SPNs at different time scales is symptomatic of
singular perturbation. The analysis of regularly perturbed SPNs
reduces approximately to the analysis of the network under fast
transitions only. We illustrate the above concepts by the
following examples.

Example 7.2: Consider the SPN shown in figure 7.3(a), where the
rates of slow transitions are modeled by a small parameter p << 1.
The reachability set is given by

      p1  p2  p3  p4
M1     1   0   0   0
M2     0   1   0   0
M3     0   0   1   0
M4     0   0   0   1

From proposition 7.1, this SPN is singularly perturbed
since markings M1 and M4 enable only slow transitions. At the fast
time scale, the SPN reduces to the subnetwork consisting of
transitions t2 and t3, and their set of input output places P1 =
{p1,p2,p3}. However, as shown in figure 7.3(b), slow transitions
t1 and t4, with input output places in P1, are also included in
the subnetwork. Although these slow transitions can be neglected
at the fast time scale, they are included here to improve the
accuracy of the approximation. This subnetwork is now recurrent
and can be aggregated into one place pa1. Figure 7.3(b) shows the
aggregated SPN at the slow time scale consisting of slow
transitions only. This SPN can be solved for the probabilities of
the markings M1' and M2' defined as follows,
      pa1  p4
M1'    1    0
M2'    0    1

Thus P(M1') = 2/3 and P(M2') = 1/3, and from the subnetwork in
figure 7.3(b),

$$P(M_1) = P(M_1')\,\frac{1}{1+p+p^2}, \qquad P(M_2) = P(M_1')\,\frac{p}{1+p+p^2},$$

$$P(M_3) = P(M_1')\,\frac{p^2}{1+p+p^2}, \qquad P(M_4) = P(M_2').$$

It can be easily seen that without the inclusion of slow
transitions t1 and t4 in the fast time scale, the probabilities
P(M2) and P(M3) would be zeros.
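The disaggregation step of example 7.2 can be written out as a short calculation. This is a minimal sketch that reads the relations above as P(Mk) = P(M1') times the conditional probability of Mk inside the aggregated place, for k = 1,2,3, and P(M4) = P(M2'); that last reading and the function name `marking_probabilities` are assumptions, and the default values 2/3 and 1/3 are the ones quoted in the example.

```python
def marking_probabilities(p, p_m1_agg=2.0 / 3.0, p_m2_agg=1.0 / 3.0):
    """Recover the probabilities of M1..M4 from the aggregated solution.

    p         : small parameter modeling the slow transition rates
    p_m1_agg  : P(M1') of the aggregated SPN
    p_m2_agg  : P(M2') of the aggregated SPN
    """
    norm = 1.0 + p + p * p  # normalizing constant of the fast subnetwork
    return {
        "M1": p_m1_agg * 1.0 / norm,
        "M2": p_m1_agg * p / norm,
        "M3": p_m1_agg * p * p / norm,
        "M4": p_m2_agg,
    }


if __name__ == "__main__":
    for name, value in marking_probabilities(0.01).items():
        print(name, round(value, 6))
```

With p = 0.01 the markings M2 and M3 receive small but nonzero probabilities, which is the effect of keeping the slow transitions t1 and t4 in the fast subnetwork.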

Example 7.3: Consider the SPN shown in figure 7.4(a), where p <<
1. Again this SPN is singularly perturbed, since there exist two
recurrent subnetworks N1 and N2 with fast transitions, where N1 =
{p3,p2,t3,t4} and N2 = {p4,p5,t6,t7}. At the fast time scale t,
the SPN is decomposed into the subnetworks shown in figure
7.4(b), where again the slow transition t5 is included since its
input and output places p2 and p1 belong to the set of input
output places of fast transitions in one of the subnetworks in
figure 7.4(b). The inclusion of this transition in this example
is crucial since the probability of finding a token in p1 would
be zero otherwise. And since p1 is an output place of this
subnetwork to the rest of the network, the rate of the output
transition t1 depends on the above probability.

At the slow time scale t/p the aggregated SPN shown in figure
7.4(c) is obtained, where the subnetworks in figure 7.4(b) are
aggregated into places pa11 and pa12. This SPN is also singularly
perturbed, since it consists of slow and fast transitions (p and
p^2), and there exist a recurrent subnetwork of fast transitions
{pa12,p6,t8,t9} and a marking that only enables the slow
transition t1. Therefore at time scale t/p it reduces to the
subnetwork of fast transitions shown in figure 7.4(d), and at
time scale t/p^2, the aggregated SPN of slow transitions shown in
figure 7.4(e) is obtained. Therefore, the networks in figures
7.4(b,d,e) are the time scale decomposition of the SPN in
figure 7.4(a) at t, t/p, and t/p^2 respectively.

7.2  Approximate  Lumping  : 

In this section, another method that reduces the analysis
complexity of SPNs will be discussed. Since SPNs are isomorphic
to MCs, state lumping defined for MCs can be implemented on
subnetworks of SPNs. We first review the notion of "lumpability"
[KEM 67, COU 77, DEL 84] of an irreducible finite state Markov
process (FSMP), and demonstrate by an example the application of
this notion to SPNs.

Let xt be a discrete parameter homogeneous MC, with state
space E = {1,2,...,n} and a transition probability matrix P. Let
{Q1,Q2,...,QR} be a partition of the set E. Each subset
Qi, i = 1,2,...,R, can be considered as a state of a new process.
Let st denote the state occupied by this new process at time t.
The probability of a transition occurring at time t from state Qi
to state Qj, P'QiQj(t), is given by,

$$P'_{Q_iQ_j}(t) = \Pr\{\, s_t = Q_j \mid s_{t-1} = Q_i,\ s_{t-2} = Q_k,\ \ldots,\ s_0 = Q_m \,\}.$$

The original MC is thus reduced to a stochastic process with
fewer states. The new process is called a lumped process, and Qi
a lumped state.

The lumped process is again a homogeneous MC only if this
probability is time independent and depends only on $s_{t-1}$, i.e.

$$P'_{Q_iQ_j}(t) = \Pr\{\, s_t = Q_j \mid s_{t-1} = Q_i \,\} = p_{Q_iQ_j} \qquad (7.2.1)$$

Kemeny and Snell [KEM 67] qualify xt as being lumpable if
equation (7.2.1) is true for every possible initial state. Defining

$$p_{iQ_j} = \sum_{k \in Q_j} p_{ik}$$

as the probability of moving from state i to set Qj, the MC
is lumpable with respect to the partition {Q1,Q2,...,QR} if and
only if all the $p_{iQ_j}$ have the same value for every $i \in Q_l$,
for any given pair $Q_l$, $Q_j$.

Similarly, consider a continuous time, homogeneous FSMP
{x(t), t>0}, with a state space E and a rate transition matrix A.
Then x(t) is lumpable with respect to a partition
{Q1,Q2,...,QR} if and only if

$$A\,U = U\,A' \qquad (7.2.2)$$

where U is the (n,R) partition matrix, whose (i,Qj) entry is

$$u_{iQ_j} = \begin{cases} 1 & \text{if } i \in Q_j \\ 0 & \text{otherwise} \end{cases} \qquad \forall\, i \in E,\ j = 1,2,\ldots,R,$$

and A' is the rate transition matrix of the lumped process {x'(t),
t>0}. Equation (7.2.2) can be written as

$$\sum_{l \in Q_k} a_{il} = a'_{Q_jQ_k}, \qquad \forall\, i \in Q_j,\ k = 1,2,\ldots,R.$$

That is, the rate of going from i in Qj to group Qk depends only on
Qj and is independent of i.
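The row-sum form of the lumpability condition lends itself to a direct check. The sketch below is illustrative rather than the report's procedure: `lump` is a hypothetical helper that tests, for every pair of groups, whether all states of the first group have the same total rate into the second, and returns the lumped matrix A' when they do. States are numbered from 0, and the five-state generator in the usage note is made up for the purpose of the example.

```python
def lump(A, partition, tol=1e-12):
    """Return the lumped rate matrix A' if A is lumpable, else None.

    A         : rate transition matrix as a list of rows
    partition : list of groups, each a list of state indices (0-based)
    """
    R = len(partition)
    lumped = [[0.0] * R for _ in range(R)]
    for j, Qj in enumerate(partition):
        for k, Qk in enumerate(partition):
            # Total rate from each state i of Qj into the group Qk.
            sums = [sum(A[i][l] for l in Qk) for i in Qj]
            if max(sums) - min(sums) > tol:
                return None  # depends on i, so the partition fails
            lumped[j][k] = sums[0]
    return lumped


if __name__ == "__main__":
    a = 4.0  # common exit rate of states 2 and 3 (hypothetical)
    A = [
        [-1.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, -5.0, 2.0, 3.0, 0.0],
        [0.0, 0.0, -a, 0.0, a],
        [0.0, 0.0, 0.0, -a, a],
        [5.0, 0.0, 0.0, 0.0, -5.0],
    ]
    print(lump(A, [[0], [1], [2, 3], [4]]))
```

If the two exit rates of states 2 and 3 differ, the same call returns `None`, since the rate into the last group then depends on which state of the lumped group is occupied.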

As an example, consider the MC shown in figure 7.5(a).
If a35 = a45 = a, then a lumped MC can be obtained as shown in
figure 7.5(b), where

$$a_{2,\{3,4\}} = a_{23} + a_{24}, \quad \text{and} \quad a_{\{3,4\},5} = a.$$

[Figure 7.5]

[Figure 7.6]

We demonstrate the application of the above concept to SPNs by
the following example.

Consider the SPN shown in figure 7.6(a), the reachability set of
which is given by,

M1   1 0 0 0 0
M2   0 1 1 0 0
M3   0 1 0 0 1
M4   0 0 1 1 0
M5   0 0 0 1 1

Let us investigate the possibility of lumping the subnetwork
{p2,p3,t2,t3}, consisting of the parallel transitions t2 and t3
together with their input output places, to obtain the lumped SPN
shown in figure 7.6(b), the reachability set of which is

M'1   1 0 0
M'2   0 1 0
M'3   0 0 1

Clearly, we are investigating the lumping of markings M2, M3, and
M4 into M'2. Let x be a random variable representing the firing
time of transition t'2. It can be easily seen that

$$x = \max(x_1, x_2),$$

where x1 and x2 are r.v.s representing the firing times of
transitions t2 and t3 respectively. Since x1 and x2 are
independent and exponentially distributed with rates r2 and r3,
then

$$P\{x < t\} = P\{x_1 < t\} \cdot P\{x_2 < t\} = (1 - e^{-r_2 t})(1 - e^{-r_3 t}).$$

Therefore, even if r2 = r3, x is not exponentially distributed,
and since by the definition of a SPN every transition must be
associated with an exponentially distributed firing time, the
lumped network in figure 7.6(b) is not defined. However, we
can approximate the distribution of x by an exponential
distribution with rate r, such that the following integral is
minimized,

$$\min_{r} \int_0^{\infty} \left[ (1 - e^{-r_2 t})(1 - e^{-r_3 t}) - (1 - e^{-r t}) \right]^2 dt, \qquad r, r_2, r_3 > 0.$$

Then

$$2r^2 (r+r_3)^2 (r_2+r_3+r)^2 + 2r^2 (r+r_2)^2 \left[ 2r_2 (r+r_3) + r_2^2 \right] - \tfrac{1}{2} (r+r_2)^2 (r+r_3)^2 (r+r_2+r_3)^2 = 0,$$

which can be solved iteratively for r.

Let r1 = r2 = r3 = 1, r4 = infinity (i.e. t4 is an immediate
transition). Then, from the above, r = 0.649. Solving the lumped
SPN for the steady state probabilities of its markings, we get

P(M'1) = 0.394, P(M'2) = 0.606, (P(M'3) = 0)

And solving the original SPN,

P(M1) = 0.4, P(M2) + P(M3) + P(M4) = 0.6

For r1 = 1, r2 = 2, r3 = 3, r4 = 4, then r = 1.546 and

P(M'1) = 0.5271, P(M'2) = 0.341, P(M'3) = 0.1317,
P(M1) = 0.5309, P(M2) + P(M3) + P(M4) = 0.336, P(M5) = 0.1327,
error = 0.7%
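The iterative solution for r can be sketched with a simple bisection. Setting the derivative of the least squares integral with respect to r to zero gives the condition 1/(4r^2) - 1/(r+r2)^2 - 1/(r+r3)^2 + 1/(r+r2+r3)^2 = 0, which is what the code solves. The steady state part is an assumption about the structure of figure 7.6: it treats the net as a simple cycle (t1, then the parallel pair, then t4), so that marking probabilities are proportional to mean sojourn times. Function names are hypothetical.

```python
def lumped_rate(r2, r3, iters=200):
    """Rate r of the exponential closest (least squares) to max(x1, x2)."""
    def g(r):
        # Stationarity condition of the least squares integral.
        return (1.0 / (4.0 * r * r) - 1.0 / (r + r2) ** 2
                - 1.0 / (r + r3) ** 2 + 1.0 / (r + r2 + r3) ** 2)

    lo, hi = 1e-9, r2 + r3  # g(lo) > 0 and g(hi) < 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cycle_probabilities(sojourn_times):
    """Steady state probabilities of a cycle from its mean sojourn times."""
    total = sum(sojourn_times)
    return [s / total for s in sojourn_times]


def lumped_probabilities(r1, r2, r3, r4):
    """P(M'1), P(M'2), P(M'3) of the lumped net (assumed cyclic)."""
    r = lumped_rate(r2, r3)
    return cycle_probabilities([1.0 / r1, 1.0 / r, 1.0 / r4])


def exact_probabilities(r1, r2, r3, r4):
    """P(M1), P(M2)+P(M3)+P(M4), P(M5) of the original net (assumed cyclic)."""
    mean_max = 1.0 / r2 + 1.0 / r3 - 1.0 / (r2 + r3)  # E[max(x1, x2)]
    return cycle_probabilities([1.0 / r1, mean_max, 1.0 / r4])


if __name__ == "__main__":
    print(lumped_rate(1.0, 1.0), lumped_rate(2.0, 3.0))
    print(lumped_probabilities(1.0, 2.0, 3.0, 4.0))
    print(exact_probabilities(1.0, 2.0, 3.0, 4.0))
```

An immediate transition t4 can be represented by passing `float("inf")` for r4, which gives it a zero sojourn time.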

If the transition rates are state (marking) dependent, then
the above will not hold, since x1 and x2 are no longer
independent. For this case, let us consider the lumped SPN shown
in figure 7.7, where the subnetwork of parallel transitions t2
and t3 is lumped into a subnetwork consisting of the two serial
transitions t'2 and t'3. The reachability set is given by,

M'1   1 0 0 0
M'2   0 1 0 0
M'3   0 0 1 0
M'4   0 0 0 1

In this case, markings M3 and M4 were lumped into M'3. Therefore,

$$r'_2 = r_2(2) + r_3(2),$$

where ri(2), i = 2,3, are the transition rates of t2 and t3 at
marking M2.

Now let x be a r.v. representing the firing time of
transition t'3, then

$$P\{x < t\} = P\{x_1 < t\}\,\frac{r_3(2)}{r_2(2)+r_3(2)} + P\{x_2 < t\}\,\frac{r_2(2)}{r_2(2)+r_3(2)}
= 1 - \frac{r_3(2)\, e^{-r_2(3) t}}{r_2(2)+r_3(2)} - \frac{r_2(2)\, e^{-r_3(4) t}}{r_2(2)+r_3(2)},$$

which reduces to the exponential distribution in two cases: case
1 when r2 = r3, and case 2 when either r2 or r3 tends to infinity.
Define 1/r'3 as the mean of the above distribution, then

$$\frac{1}{r'_3} = \frac{r_3(2)}{r_2(3)\,[r_2(2)+r_3(2)]} + \frac{r_2(2)}{r_3(4)\,[r_2(2)+r_3(2)]}.$$

Approximating the above distribution with an exponential
distribution with rate r'3, the lumped SPN can then be solved for
the steady state probability distribution. For r2(2) = 3,
r2(3) = 6, r3(2) = 2, r3(4) = 4, then r'2 = 5 and r'3 = 4.615.
Then

P(M'1) = 0.7053, P(M'2) = 0.14117, P(M'3) = 0.15295

The exact solution is,

P(M1) = 0.7053, P(M2) = 0.14116, P(M3) + P(M4) = 0.15288

and for r1 = 1, r2(2) = 4, r2(3) = 6, r3(2) = 6, r3(4) = 10,
r4 = [illegible], then r'2 = 10, r'3 = 7.1428, and

P(M'1), P(M'2), P(M'3) = [values illegible in the source]

and the exact solution is

P(M1) = 0.8064, P(M2) = 0.08064, P(M3) + P(M4) = 0.1123, 1.1% error
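The two lumped rates of the serial subnetwork follow directly from the marking dependent rates. The sketch below assumes the reading used above: r'2 is the total exit rate from M2, and 1/r'3 is the mean of the mixture of the two exponentials with rates r2(3) and r3(4), weighted by the probability of which parallel transition fired first. The function and parameter names are hypothetical.

```python
def serial_lumped_rates(r2_at_m2, r3_at_m2, r2_at_m3, r3_at_m4):
    """Return (r'2, r'3) for the serial replacement of t2 and t3.

    r2_at_m2, r3_at_m2 : rates of t2 and t3 at marking M2
    r2_at_m3           : rate of t2 at marking M3 (t3 has fired first)
    r3_at_m4           : rate of t3 at marking M4 (t2 has fired first)
    """
    exit_rate = r2_at_m2 + r3_at_m2  # r'2
    # Mean remaining time: t3 first with prob r3(2)/r'2, then t2 at M3;
    # t2 first with prob r2(2)/r'2, then t3 at M4.
    mean = (r3_at_m2 / (r2_at_m3 * exit_rate)
            + r2_at_m2 / (r3_at_m4 * exit_rate))
    return exit_rate, 1.0 / mean


if __name__ == "__main__":
    print(serial_lumped_rates(3.0, 2.0, 6.0, 4.0))
    print(serial_lumped_rates(4.0, 6.0, 6.0, 10.0))
```

The two calls correspond to the two parameter sets of the example and give r'3 close to 4.615 and 7.1428 respectively.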
CHAPTER 8

OTHER APPLICATIONS


8.1 Overview

In this chapter two other applications of GSPNs as an
analytical modeling tool will be considered. In section 8.2,
analytical models for systems Reliability and Maintainability
(R&M) are considered. In section 8.3, a model for systems fault
diagnosis is considered.


8.2  Modeling  Systems  Reliability  and  Maintainability 

The development of cost-effective analytical models for
complex systems R & M has been an active area of research [DOL
83, FLE 84a, 84b, 85]. Generally speaking, the model can be
defined on the basis of the following information: system
structure, maintenance description, and module or component data.
A complex system can often be divided into subsystems and modules.
A module is a replicable subset of a system chosen to be modeled
as a unit, on which system design, input data and maintenance
policies are based. Subsystems are sets of dependent modules
which are redundant, or which share an interdependent maintenance
strategy. Subsystems are usually defined to be independent and
connected in series. Thus a failure of a subsystem would cause
system failure. Maintenance description includes facilities,
spare modules, test equipment, and maintenance policies such as
inspection, replacements, and repair procedures. Module data
includes failure rates, repair rates, fraction of faults detected,
fraction of faults isolated, and false alarm rate.
The authors in [FLE 84b] appear to be the first to
develop a MC model which characterizes all key elements in system
R & M analysis. The state of the MC model is defined by the
number of operational or failed modules and the maintenance
actions in progress. Even though a MC can be constructed for each
subsystem, the model becomes intractable for complex systems due
to state space explosion. In this section we consider the use of
GSPNs for modeling systems R & M. GSPNs allow the activities to
be modeled at a high level of abstraction. In other words, GSPNs
provide a more detailed, yet simpler, description of system
activities. By detecting activities with durations of different
orders of magnitude, approximate aggregated models can be
obtained. This aggregation further reduces the complexity of the
analysis. The above concept is demonstrated by the following
example.

Example 8.1: Consider a system which consists of several redundant
modules. A built-in testing (BIT) unit monitors the operational
status of all modules. A module that fails and whose failure is
covered, i.e. detected and isolated, by the BIT, is removed from
the system and sent to the shop for repair. The removal is
considered as a maintenance action at the organizational ("O")
level [FEL 84b]. A failed module that was not immediately detected
will produce an error after some time. The system will then be
inspected and the failed module will be removed and sent to the
shop for repair. The inspection and the removal of the module is
considered as another maintenance action at the "O" level. A
fault that was detected but could not be isolated causes a system
failure. In this case, all modules are sent to the shop for
repair. The input parameters needed by the model are summarized
as follows:


No. of modules
Mu = module failure rate                                    10^-5 failures/hr.
FA = false alarm rate                                       varies
t1 = mean time for "O" level inspect and remove             2 hrs.
t2 = mean time for "O" level remove                         1 hr.
t3 = mean time for repair                                   72 hrs.
d  = fraction of faults detected                            varies
i  = fraction of faults isolated                            varies
FD = mean time to detect an immediately undetected fault    100 hrs.

The above parameters are taken from an example given in [FEL
84b] for a mission computer in an aircraft radar set. The false
alarm rate is defined as the rate at which the BIT detects a
fault while the system is fault-free. Operational data of some
systems show that often 50% of all maintenance actions are due to
false alarms. Figure 8.1(a) shows a GSPN which models system R &
M of the above example. The initial marking of the GSPN indicates
that there are two modules installed and active, no maintenance,
and the system is up. An important advantage of using the GSPN is
that the graphical complexity does not change with the increase
of the number of modules in the system. Note that the component
failure rate and false alarm rate are much smaller than detection
and repair rates. An approximate aggregated GSPN can thus be
obtained (figure 8.1(b)). In figure 8.1(b),

$$r_1 = (FA + 2\,Mu\,d)\,i, \qquad r_2 = r_1 (1-i)/i, \qquad r_3 = 2\,Mu\,(1-d).$$

The basic assumption in the aggregated GSPN is that a false alarm
or a second failure is not possible while one failure is present.
Such a model can then be analyzed to obtain system unavailability,
which is the probability of finding a token in F, and the average
number of maintenance actions per million hours. The approximation
was found to be accurate for imperfect isolation (i<1).
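The three rates of the aggregated GSPN of figure 8.1(b) can be evaluated for any choice of the varying parameters. The sketch below reads them as r1 = (FA + 2 Mu d) i, r2 = r1 (1 - i)/i and r3 = 2 Mu (1 - d); this reading of the formulas, the function name, and the sample values of FA, d and i in the usage lines are assumptions made for illustration (the report only states that these three parameters vary).

```python
def aggregated_rates(mu, fa, d, i, n_modules=2):
    """Rates r1, r2, r3 of the aggregated GSPN.

    mu        : module failure rate (failures/hr.)
    fa        : false alarm rate (alarms/hr.)
    d         : fraction of faults detected, 0 <= d <= 1
    i         : fraction of faults isolated, 0 < i <= 1
    n_modules : number of installed modules (two in the example)
    """
    alarm_rate = fa + n_modules * mu * d   # all BIT alarms, true or false
    r1 = alarm_rate * i                    # alarm that is isolated
    r2 = r1 * (1.0 - i) / i                # alarm that is not isolated
    r3 = n_modules * mu * (1.0 - d)        # failure missed by the BIT
    return r1, r2, r3


if __name__ == "__main__":
    # Hypothetical values for the parameters listed as "varies".
    r1, r2, r3 = aggregated_rates(mu=1e-5, fa=2e-5, d=0.9, i=0.8)
    print(r1, r2, r3)
```

Note that r1 + r2 equals the total alarm rate FA + 2 Mu d, so the isolation factor i only splits alarms between the isolated and the non-isolated branch.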

Figure 8.2 shows the impact of false alarms on system
unavailability for different values of the detection and isolation
factors. The false alarm rate is increased up to a value where 50%
of all maintenance actions are due to false alarms. It
demonstrates that the isolation factor plays a key role in the
impact of false alarms on unavailability. The higher the isolation
factor is, the lower is the impact of false alarms on
unavailability. This is because an alarm which could not be
isolated causes a system failure. The detection factor, on the
other hand, does not seem to have a significant effect on the
impact of false alarms. However, the lower the value of d in a
system with false alarms, the lower is the unavailability. This is
because an error caused by an undetected fault requires an
inspection, which is assumed by the model to detect any previously
undetected faults. The above phenomena can be seen more clearly
from the GSPN model of figure 8.1(a). This also demonstrates that
the GSPN model can facilitate the study of system activities.

8.3 Modeling Fault Diagnosis

In this section, a model for fault diagnosis (fault
isolation) will be presented using SPNs. The model is defined by
topological and functional descriptions of the system (the number
of modules, connectivity description, and module functions),
relative a priori module failure probabilities [PIP 84], and the
specified normal range for each module I/O value. The concept is
described through the following example.

Example 8.2: Consider a system which consists of three modules as
shown in Figure 8.3(a). The function of module 1 is such that an
abnormal input will still yield a normal output. Modules 2 and 3
are linear, i.e. a bad input causes a bad output. Figure 8.3(b)
shows a GSPN model for the fault diagnosis. For each input and
output of a module, two places, say y1 and y2, are assigned. The
subscript 1 stands for a normal value while 2 stands for an
abnormal one.

Transitions  between  the  input-output  places  of  a  nodule 
represent  data  flow  activities.  Probabilities  assigned  to 
conflicting  transitions  in  each  module  are  computed  from  the 
relative  a  priori  module  failure  probabilities,  which  is  assumed 
to  be  much  less  than  one  .  A  token  in  place  1,  2  or  3  indicates 
that  the  corresponding  module  is  bad.  Given  the  measurments  x 
and  2  *Z2,  the  GSPN  can  be  solved  to  determine  the  probability 
that  a  certain  module  is  faulty.  As  an  example,  suppose  that  a 
priori  failure  probabilities  of  all  modules  are  equal  to  0.1.  The 
distributions of tokens in places 1, 2 and 3 are as follows: 100
with prob. 0.09, 101 with prob. 0.01, 010 with prob. 0.81, and
011 with prob. 0.09. Therefore, the above measurements lead to
the conclusion that module 2 is most likely faulty.
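
The step from the token distribution to the diagnosis can be sketched as a marginal sum. The following Python fragment is only an illustration, not part of the original model: it takes the four marking probabilities of the example as 0.09, 0.01, 0.81 and 0.09, and it assumes that each marking string lists the tokens in places 1, 2 and 3 in that order. The names MARKING_PROBS, module_fault_probabilities and most_likely_faulty are hypothetical.

```python
# Marking probabilities for places 1, 2, 3 (a token means "module is bad").
# Assumption: each key reads as (place 1, place 2, place 3) and the values
# are the four probabilities quoted in the example.
MARKING_PROBS = {
    "100": 0.09,
    "101": 0.01,
    "010": 0.81,
    "011": 0.09,
}


def module_fault_probabilities(marking_probs):
    """Sum the probability of every marking in which a place holds a token."""
    n_modules = len(next(iter(marking_probs)))
    totals = [0.0] * n_modules
    for marking, prob in marking_probs.items():
        for i, tokens in enumerate(marking):
            if tokens != "0":
                totals[i] += prob
    return totals


def most_likely_faulty(marking_probs):
    """Return the 1-based index of the module with the largest fault probability."""
    totals = module_fault_probabilities(marking_probs)
    return max(range(len(totals)), key=totals.__getitem__) + 1


if __name__ == "__main__":
    for i, p in enumerate(module_fault_probabilities(MARKING_PROBS), start=1):
        print(f"module {i}: P(faulty) = {p:.2f}")
    print("most likely faulty: module", most_likely_faulty(MARKING_PROBS))
```

Under these assumed values, module 2 carries a fault probability of 0.90, while modules 1 and 3 each carry 0.10, which agrees with the conclusion of the example.
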

The above model can be used as a design for testability (DFT)
model. In [BAL 84], some DFT models based on fault trees were
discussed, the complexity of which is proportional to 2^n, where n
is the number of modules. On the other hand, the complexity of
the GSPN is linearly proportional to n. Also, local changes in the
system would require only corresponding local changes in the
model.


CHAPTER  9 


CONCLUSIONS 


The intent of this research was to develop analytical models
for parallel processing systems. Because of the nature of such
systems, the model has to handle such phenomena as parallel
synchronous-asynchronous operations, sequential operations,
contention for multiple resources, and queuing. The models
previously developed, all of which are based on product form
queuing networks (PFQNs), are restricted to one type of parallel
operation or the other, namely synchronous or asynchronous
operations. We considered generalized stochastic Petri nets
(GSPNs) as an alternative tool for modeling such systems. Yet a
GSPN model rapidly becomes intractable due to state space
explosion. Therefore, a hierarchical model was developed which
utilizes both GSPNs and PFQNs. A GSPN was used to model the system
workload, which comprises such activities as sequential and
parallel synchronous-asynchronous operations, and a PFQN was used
to model contention and queuing for the system resources.

In the second part of the dissertation, the analysis of GSPNs
was considered. A general method of analysis, based on
identifying an isomorphism between GSPNs and stochastically
discontinuous Markov processes, was developed. This method,
though shown to be more general than previously proposed
methods, still requires the generation of the reachability set of
the GSPN, which grows rapidly with the number of tokens in the
initial marking and with the complexity of the GSPN structure.
Therefore, two techniques for reducing analysis complexity were
developed.

The first technique deals with the decomposition of the
initial marking into several initial markings under which a
simplified analysis of the GSPN can be obtained. For example, the
examples given at the end of Chapter 5, with several tokens
initially in p1, can be analyzed first with only one token in p1
to determine the ergodic classes and transient markings under
immediate transitions, which can then be obtained for the general
case when there is more than one token in p1. This technique
can be applied to reduce the analysis complexity of a GSPN model
of the system workload when the multiprogramming level is high.
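
To illustrate why a large number of initial tokens is costly, the sketch below enumerates the reachability set of a small net by breadth-first search. The three-place cyclic net is a hypothetical example chosen for brevity, not one of the nets from Chapter 5, and it contains only ordinary transitions that move one token at a time. The names TRANSITIONS, reachability_set and count_markings are likewise assumptions made for this illustration.

```python
from collections import deque

# Hypothetical cyclic net with places p1 -> p2 -> p3 -> p1.
# Each transition (src, dst) removes one token from src and adds one to dst.
TRANSITIONS = [(0, 1), (1, 2), (2, 0)]


def reachability_set(initial):
    """Breadth-first enumeration of all markings reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for src, dst in TRANSITIONS:
            if marking[src] == 0:
                continue  # transition is not enabled in this marking
            successor = list(marking)
            successor[src] -= 1
            successor[dst] += 1
            successor = tuple(successor)
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def count_markings(tokens):
    """Size of the reachability set when `tokens` tokens start in p1."""
    return len(reachability_set((tokens, 0, 0)))


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, "token(s) in p1:", count_markings(k), "reachable markings")
```

For this particular net the count equals (k+1)(k+2)/2 for k initial tokens, so one token gives 3 markings while five tokens already give 21. Analyzing the single-token case first, as the first technique suggests, keeps the enumeration small.
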

The second technique is based on aggregation and reduction at
the GSPN level, where subnetworks of immediate transitions can be
aggregated or reduced. This reduces the structural complexity of
GSPNs.

Approximate analysis of SPNs by multiple time scale
decomposition and lumping was then considered. It was shown that
the analysis of singularly perturbed SPNs, with transition firing
rates of different orders of magnitude, can be reduced to that of
a hierarchical sequence of aggregated subnetworks, each of which
is valid at a certain time scale. The approximate lumping of SPNs
was also considered, and an example of obtaining a lumped product
form SPN from a non-product form SPN was given.

Finally, the application of GSPNs to modeling system
reliability, maintainability, and fault diagnosis was considered.
It was shown that such models are much easier to construct in a
compact form than the existing Markov chain models, especially
for systems with a large number of redundant components.




BIBLIOGRAPHY 


ADAM 70  Adams, D. A. "A Model for Parallel Computations,"
Parallel Processor Systems, Technologies and Applications,
Hobbs (ed.), Spartan Books, New York, 1970.

BAE 77  Baer, J. L., and J. Jensen "Simulation of Large Parallel
Systems: Modeling of Tasks," Measuring, Modeling, and
Evaluating Computer Systems, H. Beilner and E. Gelenbe
(eds.), North-Holland, 1977.

BAER 73  Baer, J. L. "A Survey of Some Theoretical Aspects of
Multiprocessing," ACM Computing Surveys, vol. 5, No. 1,
March 1973, pp. 31-80.

BAL 84  Balaban, H. S., and W. R. Simpson "Testability /
Fault Isolation by Adaptive Strategy," Proceedings of
the 1984 Reliability and Maintainability Symposium,
IEEE, 1984.

BAR 77  Bard, Y. "A Characterization of VM/370 Workloads,"
Modeling and Performance Evaluation of Computer Systems,
H. Beilner and E. Gelenbe (eds.), North-Holland, 1977.

BAR 78  ----- "The VM/370 Performance Predictor," ACM Computing
Surveys, vol. 10, pp. 333-342, 1978.

BAR 79  ----- "Some Extensions to Multi-class Queuing Networks
Analysis," Performance of Computer Systems, M. Arato,
A. Butrimenko, and E. Gelenbe (eds.), North-Holland, 1979.

BAU 78  Baudet, G. M. "Asynchronous Iterative Methods for
Multiprocessors," Journal of the ACM, 25(2), pp. 225-244,
1978.

BUZ 71  Buzen, J. P. "Queuing Network Models of
Multiprogramming," Ph.D. Dissertation, Harvard Univ.,
Cambridge, MA, 1971.

BUZ 80  Buzen, J. P., and P. J. Denning "Measuring and
Calculating Queue Length Distributions," Computer,
vol. 13, pp. 33-44, 1980.

CER 72  Cerf, V. "Multiprocessor Semaphores, and a Graph Model
of Computations," Ph.D. Thesis, UCLA, April 1972.

CHA 77  Chandy, K. M., J. H. Howard, and D. F. Towsley "Product
Form and Local Balance in Queuing Networks," Journal of
the ACM, vol. 24, pp. 250-263, 1977.


CHA 81  Sauer, C. H., and K. M. Chandy Computer Systems
Performance Modeling, Prentice-Hall, 1981.

CHI 85  Chiola, G. "A Software Package for the Analysis of
Generalized Stochastic Petri Nets," International
Conference on Timed Petri Nets, IEEE, July 1985.

COD 83a  Coderch, M., A. S. Willsky, S. S. Sastry, and D. A.
Castanon "Hierarchical Aggregation of Linear Systems
with Multiple Time Scales," IEEE Transactions on
Automatic Control, vol. AC-28, No. 11, November 1983.

COD 83b  ----- "Hierarchical Aggregation of Singularly Perturbed
Finite State Markov Processes," Stochastics, vol. 8,
pp. 259-289, 1983.

CON 63  Conway, M. E. "A Multi-Processor System Design,"
Proceedings of the AFIPS 1963 FJCC, pp. 136-146, 1963.

COU 77  Courtois, P. J. Decomposability: Queuing and Computer
System Applications, Academic Press, New York, 1977.

DEL 81  Delebecque, F., and J. P. Quadrat "Optimal Control of
Markov Chains Admitting Strong and Weak Interactions,"
Automatica, vol. 17, pp. 281-296, 1981.

DEL 84  Delebecque, F., J. P. Quadrat, and P. V. Kokotovic "A
Unified Approach of Aggregation and Coherency in Networks
and Markov Chains," Private Communications, 1984.

DIJ 68  Dijkstra, E. W. "Cooperating Sequential Processes,"
Programming Languages, F. Genuys (ed.), Academic Press,
1968.

DOL 83  Dolny, L. J., R. E. Fleming, and R. L. De Hoff "Fault-
Tolerant Computer Systems Design Using GRAMP,"
Proceedings of the 1983 Reliability and Maintainability
Symposium, IEEE, 1983.

DOO 42  Doob, J. L. "Topics in the Theory of Markov Chains,"
Transactions of the American Mathematical Society,
vol. 52, pp. 57-64, 1942.

DOO 53  ----- Stochastic Processes, Wiley, New York, 1953.

DYN 65  Dynkin, E. B. Markov Processes, Springer-Verlag,
Berlin, 1965.

FLE 84a  Fleming, R. E., L. J. Dolny, and R. L. De Hoff "Fault-
Tolerant Design-To-Specs with GRAMP and GRAM,"
Proceedings of the 1984 Reliability and Maintainability
Symposium, IEEE, 1984.

FLE 84b  Fleming, R. E., Josselyn, L. J. Dolny, and R. De Hoff
"Early Design Tradeoffs of Test Strategy Effectiveness
and Support Cost," Proceedings of AUTOTESTCON, 1984.

FLE 85  ----- "Complex System RMA&T Using Markov Models,"
Proceedings of the 1985 Reliability and Maintainability
Symposium, IEEE, 1985.

FLO 84  Florin, G., and S. Natkin "An Ergodicity Condition for
a Class of Stochastic Petri Nets," Fifth Workshop on
Application and Theory of Petri Nets, Arhus, Denmark,
June 1984.

GOS 71  Gostelow, K., V. Cerf, G. Estrin, and S. Volansky
"Proper Termination of Flow of Control in Programs
Involving Concurrent Processes," Proceedings of the
ACM National Conference, 1972.

HEI 82  Heidelberger, P., and K. S. Trivedi "Queuing Network
Models for Parallel Processing with Asynchronous Tasks,"
IEEE Transactions on Computers, vol. C-31, pp. 1099-1109,
November 1982.

HEI 83  ----- "Analytical Queuing Models for Programs with
Internal Concurrency," IEEE Transactions on Computers,
vol. C-32, January 1983.

HEI 84  Heidelberger, P., and S. Lavenberg "Computer Performance
Evaluation Methodology," IEEE Transactions on Computers,
vol. C-33, December 1984.

HER 75  Herzog, U., L. Woo, and K. M. Chandy "Solution of the
Queuing Problems by a Recursive Technique," IBM
Journal of Research and Development, vol. 19, May 1975.

HER 79  Herzog, U., W. Hoffmann, and W. Kleinoder "Performance
Modeling and Evaluation for Hierarchically Organized
Multiprocessor Computer Systems," Proceedings of
the 1979 International Conference on Parallel
Processing, IEEE, 1979.

KAI 84  Kai Hwang, and F. Briggs Computer Architecture and
Parallel Processing, McGraw-Hill, 1984.

KAR 66  Karp, R. M., and R. E. Miller "Properties of a Model
of Computations: Determinacy, Termination, Queuing,"
SIAM Journal of Applied Mathematics, XIV (November 1966).


KEI 79  Keilson, J. Markov Chain Models: Rarity and
Exponentiality, Springer-Verlag, New York, 1979.

KEL 78  Kelly, F. P. Reversibility and Stochastic Networks,
Wiley, New York, 1978.

KEM 60  Kemeny, J. G., and J. L. Snell Finite Markov Chains,
Van Nostrand-Reinhold, Princeton, New Jersey, 1960.

KLI 75  Kleinrock, L. Queuing Systems, Vol. I: Queuing Theory,
Wiley, New York, 1975.

KLI 76  ----- Queuing Systems, Vol. II: Computer
Applications, Wiley, New York, 1976.

KUN 76  Kung, H. T. "Synchronous and Asynchronous Parallel
Algorithms for Multiprocessors," Algorithms and
Complexity: New Directions and Recent Results, J. F.
Traub (ed.), Academic Press, New York, 1976.

KUN 80  ----- "The Structure of Parallel Algorithms," Advances
in Computers, vol. 19, M. S. Yovits (ed.), Academic
Press, 1980.

MAE 76  Maekawa, M., and D. L. Boyd "Two Models of Task
Overlap within Jobs of Multiprocessing Multiprogramming
Systems," Proceedings of the 1976 International
Conference on Parallel Processing, IEEE, 1976.

MAR 84  Marsan, M. A., G. Balbo, and G. Conte "A Class of
Generalized Stochastic Petri Nets for the Performance
Evaluation of Multiprocessor Systems," ACM Transactions
on Computer Systems, vol. 2, No. 2, May 1984.

MOL 81  Molloy, M. K. "On the Integration of Delay and
Throughput Measures in Distributed Processing Models,"
Ph.D. Dissertation, UCLA, 1981.

MOL 82  ----- "Performance Analysis Using Stochastic Petri
Nets," IEEE Transactions on Computers, vol. C-31,
September 1982.

MUR 77  Murata, T. "State Equations, Controllability and Maximal
Matching of Petri Nets," IEEE Transactions on
Automatic Control, June 1977.

NAT 80  Natkin, S. Les Reseaux de Petri Stochastiques,
These de Docteur Ingenieur, CNAM-PARIS, June 1980.

PET 74  Peterson, J. L., and T. H. Bredt "A Comparison of
Models of Parallel Computations," Proceedings of
the IFIP Congress, 1974.

PET 75  Peterson, J. L., and W. Bulgren "Studies in Markov
Models of Computer Systems," Proceedings of the
ACM Annual Conference, 1975.

PET 81  Peterson, J. L. Petri Net Theory and the Modeling of
Systems, Prentice-Hall, 1980.

PHI 81  Phillips, G. Randolph, and Petar V. Kokotovic "A
Singular Perturbation Approach to Modeling and Control
of Markov Chains," IEEE Transactions on Automatic
Control, vol. AC-26, No. 5, October 1981.


PIP 84  Pipitone, F. "An Expert System for Electronics
Troubleshooting Based on Function and Connectivity,"
Private Communication.

PRI 75  "Models of Multiprogrammed Computer Systems,"
Austin, 1975.

RAM 74  Ramchandani, C. "Analysis of Asynchronous Concurrent
Systems by Timed Petri Nets," Ph.D. Thesis, MIT, 1974,
Project MAC report MAC-TR-120.

REI 80  Reiser, M., and S. S. Lavenberg "Mean Value Analysis of
Closed Multi-chain Queuing Networks," Journal of the
ACM, vol. 27, pp. 313-322, 1980.

TOM 84  Thomasian, A., and P. Bay "Queuing Network Models for
Parallel Processing of Task Systems," Proceedings of
the 1983 International Conference on Parallel Processing,
IEEE, 1983.

TOW 75  Towsley, D. "Local Balance Models of Computer Systems,"
Ph.D. Dissertation, Department of Computer Science,
University of Texas, Austin, 1975.

TOW 78  Towsley, D., K. M. Chandy, and J. C. Browne "Models
for Parallel Processing within Programs: Application to
CPU:I/O and I/O:I/O Overlap," Communications of the ACM,
vol. 21, pp. 821-831, 1978.



DISTRIBUTION  LIST 


M. E. Nunn

W. Keiner

J. Kunert

T. Coppola

B. Ashmore

Naval Air Systems

Command (Library)

Air-9330


Code  9304 
Code  F305 
Code  92512 
Code  RBET 


14  COPIES 
1  copy 


Naval Ocean Systems Center
San Diego, CA 92152
Naval Surface Weapons Center
Dahlgren, VA 22448-5000
Naval Air Engineering Center
Lakehurst, N.J.

Rome Air Development Center
Griffiss AFB, Rome, N.Y.

ATAC, Inc., 1200 Villa St.
Mountain View, CA 94041

Washington, D.C. 20361


R.  Tenney 


Alphatech,  Inc. 

111 Middlesex Turnpike
Burlington,  MA.  01803 


